EDUCATIONAL AND PSYCHOLOGICAL 
MEASUREMENT 





Volume II JULY, 1942 Number 3





THE EXAMINERS OFFICE OF THE UNIVERSITY SYSTEM OF
GEORGIA ............................................... 233
F. S. Beers

LEVELS OF COMPETENCE IN COUNSELING—A POST-WAR PROBLEM
FOR STUDENT PERSONNEL WORK IN SECONDARY SCHOOLS ........ 243
Milton E. Hahn

A STUDY OF SOME LOCAL FACTORS AFFECTING STUDENTS' SCORES
ON THE MINNESOTA PERSONALITY SCALE ..................... 257
Betty M. Horne and W. C. McCall

THE PLACE OF APTITUDE TESTING IN THE PUBLIC SCHOOLS ..... 267
Donald E. Super

EFFECT OF ENGINEER SCHOOL TRAINING ON THE SURFACE
DEVELOPMENT TEST ...................................... 279
Ruth D. Churchill, Jeanne M. Curtis, Clyde H. Coombs,
and Thomas W. Harrell

AN AID TO STUDENT COUNSELORS ........................... 281
Ralph F. Berdie

A COMPARISON OF THE HUMAN BEHAVIOR INVENTORY WITH TWO
OTHER PERSONALITY INVENTORIES .......................... 291
Abraham Sperling

INTRA-INDIVIDUAL DIFFERENCES VERSUS INTER-INDIVIDUAL
DIFFERENCES IN MOTOR SKILLS
William A. Owens, Jr.

NEW TESTS

MEASUREMENT ABSTRACTS








Copyright, 1942, by 
SCIENCE RESEARCH ASSOCIATES 


PRINTED IN THE UNITED STATES OF AMERICA 


THE EXAMINERS OFFICE OF 
THE UNIVERSITY SYSTEM OF GEORGIA 


F. S. BEERS 


Social Security Board 


THE UNIVERSITY SYSTEM OF GEORGIA is unique
among the states. It is a centrally administered, govern-
mentally supported organization of 15 colleges now complet-
ing its first decade. Whether state-supported higher education
so conceived and so administered can and should endure is a
question which is fittingly being tried out, as it were, in “the
oldest chartered state university” and its branches.

Before 1931 there were 25 state-supported colleges in 
Georgia, with a grand total of 365 college trustees. Each col- 
lege operated as a unit, appealed to the legislature for financial 
support in competition with the other colleges, arranged its 
curriculum and its administration as it saw fit, and ordered its 
affairs to please itself. The older and stronger of the colleges 
used as their chief defensive weapon a policy of paring down 
or reducing in value the credits earned at the younger and 
weaker colleges, thus discouraging enrollment at these institu- 
tions and exacting tribute of students who transferred from 
them. 

In one stroke the Reorganization Act of 1931 swept this 
scramble into the discard. Ten colleges were abolished,¹ a
single Board of Regents replaced the 365 local trustees, and a 
chancellor was set up as chief administrative officer. In the 
Chancellor and the Board of Regents was vested the authority 





¹ The colleges surviving the reorganization were: The University of
Georgia with its School of Medicine, The Georgia School of Technology, two 
senior colleges for women, one college for teachers, seven junior colleges, and 
three colleges for Negroes. The average annual enrollment in regular session 
is about 12,000 students. 




for setting the educational and financial policies of the system 
of colleges and for reviewing the activities of colleges indi- 
vidually, as they might add to or detract from the effectiveness 
of service to the state. 

As part of the reorganization it was recommended that
“at an early date there should be added to the Chancellor’s
office an . . . officer properly trained in educational and sta-
tistical techniques [who] should be charged, under the super-
vision of the Chancellor, with the necessary duty of assem-
bling, analyzing, and interpreting the regular and special re-
ports of the operations of the several branches so as to make
continually available in proper form for the Board of Regents
that general information and other specific data upon which
the Board may base its actions.”

An office for this purpose was established in 1934 by order 
of the Regents and was located at the University,² that being
considered the hub of the academic wheel whose circumference 
is the state system of higher education. The Regents wisely 
provided this new office with the nucleus of a bureau of stand- 
ards against which educational accomplishments and experi- 
ments could be measured and from which administrative poli- 
cies of individual colleges could be, directly or indirectly, 
evaluated. 

This provision included a basic curricular pattern of ten 
courses representative of general education, which was re- 
quired in all the colleges, the content of the courses having 
been determined by the faculties of the colleges in a series of 
conferences. 

To provide for the effective administration of these 
courses, information for their frequent revision, and a guar- 
antee that equal achievement on the part of students regard- 
less of college should be given equal credit, with right of trans- 
fer of credits without let or hindrance, the Regents authorized 
state-wide examinations on these courses and common inter- 
pretation of scores made by students taking them. 





² The University is in Athens; the Chancellor's office is in the State Capitol,
Atlanta. 




Administrative responsibility for this policy was assigned
by the Board to the new office it had set up, which later, by
general acceptance, came to be known as the “Examiners Of-
fice.” Supervising and administering course examinations,
however, were intended to be no more than partial bases of
operations for more important duties and obligations of the
office.

The primary bureau of standards, consisting of 10 survey
courses generally required, was augmented by authority to
make use of a variety of devices for gauging the effectiveness
of the program, among them measures of the relative quality
of students electing to attend and those not electing to attend
college, together with ways of improving selection; analysis
of the physical well-being of students and its relation to men-
tal acuity; evaluation of the amount of cultural background,
skill, and intellectual power that the college environment pro-
vides, with applications to the individual problems of students
through educational and vocational guidance.

From the general framework and techniques of analysis 
that are employed in the attack upon these problems have 
come numerous adaptations that provide partial, and often
rather full, answers to such questions as the relative cost to
the state of general and special education, the relative ef-
fectiveness of each as judged by administrators, as observed
in student opinion and experience, and as measured against
outside criteria; optimum size of class enrollments; whether
education conceived and practiced as a purely personal matter
between instructor and student tends to crystalize and crum-
ble more or less rapidly than when it is broadly administered
and variously supervised as, for example, under central as
against local administration, or under divisional as against 
departmental jurisdiction. Nearly all of these problems re- 
volve on the one hand around individual college prerogatives, 
and on the other around obligations of the colleges, collect- 
ively, to the state. 

How successful has this venture toward a university system 
proved itself to be? 




It is perhaps too early, after the lapse of approximately 
a decade only, to judge fairly whether the experimental at- 
tempt to establish the University System and maintain it 
through a research program will ultimately make a positive 
contribution to policy in higher education generally, as the 
founding of the first State University did so well over a cen- 
tury and a half ago. 

Of the successes or failures, those in administration are, of 
course, the most difficult to appraise. Any issues that point 
up a central administration as superseding local or college 
jurisdictions are bound to generate an appreciable amount of 
heat and to invite emphatic assertion of “states’ rights.” 
Hence, it is to be expected that patterns of incandescent light 
will flicker frequently, as they have, among the colleges of 
Georgia and will wax in strength within the faculties of indi- 
vidual colleges and departments as well. Attention must thus 
inevitably be divided, often on very plausible and sincerely 
held grounds, between these things and many admirable ac- 
complishments of central administration such as are annually 
cited by the Chancellor in his report to the Governor. 

But in the less controversial sphere of service and research, 
it can be claimed with confidence that many of the techniques 
employed in the experiment toward a university system have 
not only served their immediate purpose well but also have 
proved useful in helping to set administrative policies. 

Course examinations in the basic curriculum have been 
the key to assembling data on the effectiveness of the educa- 
tional program. These courses are set up on a five-hour, 
quarterly plan. There are three in the social studies, two 
each in the biological and physical sciences, two in humanities, 
and one in mathematics. Two in elementary English were 
added to this group for examination purposes in 1936; and 
two in chemistry in 1940. Since the war emergency, addi- 
tional courses in mathematics and physics are also being in- 
cluded for the men students. 

Quarterly from 500 to 2,500 person-examinations are ad- 
ministered in each of the basic courses, with an average per 




course approximating 1,000. Over the period of a year about
200 teachers take part in the instructional and examining
programs.

Administrative procedures for these programs are de- 
signed to furnish a framework within which the individual 
talents of teachers not only may be protected but also may be 
given direction. These procedures may be summarized as 
follows: 

1. Conferences and committees to formulate the aims of 
courses and the content-outline of examinations, and 
to consider the limitations imposed upon the objectives 
of examining by the average student to be served and 
by the extent of variability from this average of other 
students who likewise are to be judged on examina- 
tion results; 

2. Participation by teachers in these general formulations, 
in the types of questions or tasks that will be required 
of persons to be examined, and in the final selection 
of materials to be included in an examination; 

3. Definite and regular assignments in question making 
for inclusion in examinations, with complete encour- 
agement to offer innovations with respect to both form 
and content; and 

4. An office for analysis, research, and collation of the 
examining function with respect to construction, ad- 
ministration, and interpretation. 


Individual and group conferences are used extensively for 
the purpose of aiding members of the teaching staff in the 
preparation and improvement of examination questions. Item 
analyses are placed at the disposal of committees and instruc- 
tors, and reports, digests, mimeographed material, and the 
like are made available for the information of staff members.³

Scaling of examination results is done by the Examiners 
Office, with the advice of Divisional Heads and after periodic 
canvassing of faculty opinion. Final grades in course work 
are assigned by individual teachers by means of a type of 





³ For example, F. S. Beers and others, Some Principles of Examining, with
Aids for Consulting Examiners (University of Georgia Press, 1942), 45 pages. 




transmutation to letter grades of the average ranks on class 
work assembled by teachers and on scores on the final exami-
nation.⁴ Final grades are comparable from college to college
and among the basic courses. This feature is imperative if the
transfer of credits and students is to be effected on a legiti-
mate basis; and in a university system it should take prece-
dence over the more common practice in colleges of subordi-
nating the examining function to the much vaunted “objectives
of instruction.” It is apparent, however, that the two points
of view need not be mutually exclusive. Those who make such 
a claim tend, perhaps, to think more with their bile than with 
their brains. 

Machine scoring of final examinations is done by the staff 
of the Examiners Office. From 9,000 to 13,000 answer sheets 
are scored quarterly in a period not exceeding four days. The 
method used is unique. Each college alphabetizes its answer 
sheets immediately after an examination, prepares an alpha- 
betical list of student names in duplicate, packages both, and 
expresses or brings the packages to the central office. Here 
the procedure is as follows: 


1. The duplicate list of names is inserted in a typewriter
set up adjacent to the scoring machine, and a typist
is put in charge to record scores;

2. A tally clerk equipped with printed forms is located
in front of the scoring machine facing the operator;

3. The scoring machine operator calls each name and
score (part or whole as the case may be) but does not
record it on the answer sheet;

4. The typist and tally clerk record the scores on their
respective forms from the “call” of the machine op-
erator;

5. A calculating machine operator summates the tallies
by sections and colleges and runs the scale on the total
distribution for the State;

6. Names, raw scores, and the scale for transmuting
scores into grades are put in an envelope and mailed
special delivery to each college dean.

⁴ See F. S. Beers and H. M. Cox, “Measurement or Marking?” Journal of
the American Association of Collegiate Registrars (April, 1938).

On the average, the total possible score per examination
is 150 points, although some tests may have a possible total
of more than double this figure. Rescoring has shown small
errors, ±1 point, to be characteristic of about 5 per cent of
answer sheets. Only very occasionally do large errors occur.
As a check on these, deans are instructed to call for rescoring 
whenever a student's displacement in rank between the ex- 
amination score and his class work exceeds one letter grade 
or whenever an instructor makes a request for rescoring. 
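
The rescoring rule just described can be sketched in a few lines. This is an illustration only: the five-letter scale coded 4 to 0, the reading of "displacement" as the absolute difference in letter-grade steps, and the function name are all assumptions, since the article does not specify how the comparison was carried out.

```python
# Hypothetical coding of letter grades; the article does not give the scale.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


def needs_rescoring(exam_grade: str, class_grade: str,
                    instructor_request: bool = False) -> bool:
    """Return True when an answer sheet should be sent back for rescoring.

    A sheet is flagged if the instructor asks for it, or if the grade implied
    by the examination score and the grade implied by class work differ by
    more than one letter-grade step (an assumed reading of "displacement").
    """
    displacement = abs(GRADE_POINTS[exam_grade] - GRADE_POINTS[class_grade])
    return instructor_request or displacement > 1
```

Under these assumptions a student graded A on the examination and C on class work would be flagged, while a B against a C would not be unless the instructor asked.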

Item analyses of questions used in the examinations on 
the basic courses prove extremely valuable in the selection of 
items for subsequent inclusion in freshman placement and 
sophomore comprehensive tests. Each year a battery of such 
tests is constructed, covering general education. The parts 
are divided so that approximately equal weight is given to 
scientific and verbal skill, roughly paralleling the Q and L 
scores for the American Council Psychological Examination. 
Repeated samples on students taking both the A.C.E. and the
Southeastern Aptitude Examinations yield coefficients of cor-
relation with a median value of .90.
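
For readers who wish to reproduce a figure of this kind, the computation can be sketched as follows. The sketch assumes the coefficient meant is the ordinary product-moment (Pearson) correlation, computed once per sample of paired scores and then summarized by the median; the article does not name the coefficient, and the function names are illustrative.

```python
from math import sqrt
from statistics import mean, median


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def median_correlation(samples):
    """Median of the correlations over several (x_scores, y_scores) samples."""
    return median(pearson_r(x, y) for x, y in samples)
```

Each element of `samples` would hold the paired scores of one group of students on the two examinations; no actual score data from the Examiners Office are reproduced here.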

The Southeastern Aptitude Examinations are constructed 
in April of each year, are first administered to sophomores as 
“comprehensives,” and in the following fall are given to fresh-
men as placement tests. Statistical comparability for succes-
sive editions is based on the assumption that the freshman
and sophomore populations are substantially the same in abil-
ity and achievement from year to year. This assumption is
checked periodically by means of sampling with the same form
of the A.C.E.

The framework of placement and end-of-the-year sopho- 
more testing supplies a valuable reference for numerous 
studies. Relative gains over the first two years in college by 
fields may thus be estimated; and the results, when placed 
at the disposal of committees on course content, have been 




found useful for revision purposes. The general or com-
posite indices on the tests for freshmen supply “expectancies”
that have been valuable in reforming grading practices in the 
non-basic or pre-professional curricula, in which marking has 
been found to be, on the average, a letter grade higher than 
in the basic courses, besides being for the most part extremely 
unreliable. Placement and sophomore test scores likewise 
make possible predictive studies of general ability in relation 
to achievement in the basic courses, where grading is rela- 
tively reliable and comparable from college to college. The 
part scores as well as the general score index may also be 
used for similar inquiries. 

Coupled with the placement testings are centrally admin- 
istered physical examinations. Medical staff officers and med- 
ical college seniors give their services for this purpose. The 
examination blank is set up for Hollerith tabulation and in- 
cludes, besides quantification of clinical findings, a socio-eco- 
nomic scale and an index of emotional stability. Tabulation 
of the data makes possible, together with the “paper and pen- 
cil” testing, a variety of studies bearing on the physical and 
mental development of students coming from many different 
types of environment. 

Surveys of student opinion of college work have been 
demonstrated as worth while in shedding light upon the ef- 
fectiveness of educational practices and in comparing, from 
this point of view, the relative quality, difficulty, and popular- 
ity of the basic and pre-professional curricula. The setting 
is especially favorable to useful measures of student opinion, 
since approximately half of the curriculum at the junior col- 
lege level is composed of basic courses common to all students 
and half of pre-professional or vocational courses.⁵


All examinations, forms, questionnaires and the like are 
prepared centrally by the photo-offset process. Collectively, 
the examinations of all kinds for a single year approximate 





⁵ “Student Opinion of College Courses, 1937 and 1940,” Examiners Office
Bulletin, September, 1940, University of Georgia Press.


200,000 copies. About 35 per cent of these are used outside 
of the University System, by colleges and high schools in the 
Southeast. 

All data from examinations, periodic and occasional re- 
ports and studies, and general conclusions about educational 
policy growing out of the services and research are made 
available through conferences, correspondence, and formal 
documents to the Chancellor, the Board of Regents, and the 
presidents and faculties of the 15 colleges of the System. A 
University System Council of which the Examiner is executive 
secretary formulates the educational policies for the System 
and recommends its findings to the Chancellor and Board of 
Regents for action. 




















LEVELS OF COMPETENCE IN COUNSELING—A 
POST-WAR PROBLEM FOR STUDENT PER- 
SONNEL WORK IN SECONDARY SCHOOLS 


MILTON E. HAHN 
University of Minnesota 


MANY THOUGHTFUL secondary school administra-
tors are deeply concerned with the readjustment prob-
lems which will face the United States and its educational in-
stitutions after the present world conflict. The depression
years of the past decade gave a fore-taste of the services 
which will be demanded of schools and their personnel work- 
ers in a post-war world. The inadequacy of student person- 
nel work between 1929 and 1940 was brought sharply home 
to our high schools by the creation of new governmental 
agencies which were established to compensate for the short- 
comings of public education. In many communities the first 
professional guidance services for youth were introduced by 
the National Youth Administration, the Civilian Conserva- 
tion Corps, the Work Projects Administration or the Fed- 
eral-State Employment Service. The work of these agencies, 
coupled with the relatively careful man-job analyses being 
made by the personnel divisions of the armed services, raises 
a serious question as to whether or not the public will accept 
traditional hit-or-miss methods in preparing post-war youth 
for meeting its responsibilities. School administrators face 
conditions which demand constructive action if their institu- 
tions are to retain the high public esteem and financial support 
they have enjoyed in the past. 

Student personnel work is a relatively new educational 
configuration in secondary schools. For two decades begin- 




ning about 1909, the major emphasis was upon treating indi- 
viduals and their particular complex problem patterns with 
group methods, paralleled during the second decade by the use 
of tests in college and industry. The third decade of the move- 
ment was devoted to a search for tools and techniques more 
valid and reliable than the lecture and casual conferences be- 
tween a student and a teacher. This decade contributed much 
to the methodology of job analysis. It was also marked by 
the flowing together of these two movements. With the emer-
gence of better tools and techniques for man analysis, compe-
tent general counselors began to be trained and utilized in
colleges, universities, and large secondary schools. During 
the 1930-40 decade the leaven of professionally trained stu- 
dent personnel workers spread unevenly over the country into 
small colleges, junior colleges, and high schools enrolling 
less than 500 students. 


This decade also was marked by attempts to define and de-
scribe student personnel work.¹ The older term, guidance,
had, because of disputes as to its nature, become more and
more meaningless. Various schools of thought stretched
“guidance” to mean vocational guidance only,² to be a syn-
onym for education,³ and to cover the ordinary non-lecture
activities of classroom teachers alone.⁴ There is no present
definition of either guidance or personnel work which is gen- 
erally accepted by all workers with youth problems. The 
matter of definition is of interest to us here only because it is 
necessary to limit the scope of our materials. Personnel work 
with secondary school students must be broadened in scope to 
include responsibilities in certain directions for out-of-school 





¹ The reader interested in the development of student personnel work is
referred to the following sources: W. H. Cowley, “The Nature of Student
Personnel Work,” The Educational Record, April, 1936; George E. Myers, “The
Nature and Scope of Personnel Work,” The Harvard Educational Review,
January, 1938; Donald G. Paterson, “The Genesis of Modern Guidance,” The
Educational Record, January, 1938.

² H. D. Kitson, “Getting Rid of a Piece of Educational Rubbish,” Teachers
College Record, XXXVI (October, 1934).

³ John M. Brewer, Education is Guidance (New York: Macmillan, 1932).

⁴ J. E. Walters, Individualizing Education (New York: John Wiley & Sons,
1935).




youth. Therefore the following working definition is offered 
as a frame of reference for this article. 

Personnel work with youth is the marshalling, under the 
best obtainable professional leadership, of educational and 
other community resources to aid individual youth, in and out 
of school, to help themselves toward optimal resolutions of 
immediate and long-range problems in the various life-prob- 
lem areas. 

Because the average community is small and because the 
majority of workers with youth are found in the schools, the 
personnel program will tend to center about the secondary 
school for all community youth. Although this will be a 
typical situation, it will be necessary for the educational per- 
sonnel worker employed in the educational system to refer 
many problems to other professional workers in the geo- 
graphic or political district. At what point does the person- 
nel worker in our schools face the necessity for referral of 
cases? A partial answer can be obtained through considera- 
tion of arbitrarily selected categories of personnel workers 
and the estimated competence for the average individual in 
each category relative to counseling effectiveness. Such an 
approach requires consideration of many variables and com- 
plicates verbal presentation. Again the writer exercises his 
prerogative of being arbitrary and for the sake of simplifica- 
tion selects the variables to be introduced. We shall consider 
teacher-counselors, vocational specialists, and clinical coun- 
selors as the categories of youth personnel workers. Life- 
problem areas will be represented by vocational problems and 
educational problems. Levels of case history interpretation 
and use of tools and techniques of the counselor are selected 
as the axes upon which our worker categories and life-prob- 
lem areas will be considered. 

Life-Problem Areas 

Personnel work with individuals is necessary because they 
have problems which they are unable to resolve satisfactorily 
unaided. These problems occur in tangled patterns in which 
it is frequently impossible to separate one general kind of 




problem from another or show clearly which is cause and 
which is effect. Because it is impossible adequately to verbal- 
ize a whole problem pattern, we must discuss the interrelated 
problems as if they were discrete phenomena. A commonly- 
used categorization of life-problems includes vocational, edu- 
cational, personal, health, and financial adjustments. For our 
purposes we consider only the first two. 

Vocational problems are those in which lack of adjustment 
is caused by poor choices of vocational field or level, no choice, 
or uncertainty of choice with the need for competent assur- 
ance or advice. Vocational problems may be considered in 
some aspects as phases of educational problems. Although 
vocational problems often are treated as if they were simple 
in nature, they are extremely complicated in many individuals. 
Sound vocational counseling requires that the counselor be 
familiar with the theory and clinical usage of the psychologi- 
cal concepts of abilities, aptitudes, and interests. Because 
of this, reliance upon untrained counselors and self-analysis 
has been discarded by the best practitioners. The case against 
these traditional methods has been stated ad nauseam, but
these methods are still utilized in many secondary school guid-
ance programs. A recent study by Stone⁵ presents further
reasons for questioning present common treatment of voca- 
tional problems in adolescents. 

Student educational problems are those caused by being 
thwarted in whole or in part in the attempt to proceed through 
a training program (usually formal) toward a goal. For
many youth this goal is occupational in nature. As has al-
ready been said, vocational and educational problems are very
frequently different aspects of the same general condition. 
Educational problems like others range from the very simple, 
such as a choice between afternoon or morning classes, to very 
complex, such as a complicated reading disability requiring 
special remedial work. If possible, we have placed even 





⁵ C. Harold Stone, “Evaluation Program in Vocational Orientation.” Studies
in Higher Education, Biennial Report of the Committee on Educational Re- 
search (Minneapolis: University of Minnesota Press, 1938-1940), pp. 131-145. 




greater reliance upon self-analysis for resolution of educa- 
tional problems than has been true of other kinds of problems. 
The tragic results of past treatment of the educational prob- 
lems of youth fill the literature. A pointed commentary on 
our educational counseling is found in the New York Regents 
Study.⁶


The Teachers’ Level of Counseling Competence 


In the student personnel programs of many secondary 
schools the teacher is the personnel worker. There are a 
number of reasons why this condition exists. The most im- 
portant factor contributing to such programs is the concept 
of guidance held by so many secondary school administrators. 
To believe that teachers trained chiefly for classroom teach- 
ing can deal adequately with the serious problems of youth 
implies that the follower of this creed also believes that: 


1. Student self-analysis has high validity and reliability.

2. Student problems are seldom serious.

3. Professional workers are not needed.

4. Teachers have enough free time to know each student
intimately and discharge counseling responsibilities.

5. Tools and techniques beyond interviews and school
grades and their interpretation are not worth employ-
ing or are quickly learned by classroom teachers.


The weaknesses of the teacher-counselor type of program 
in which many or all teachers consult on all kinds of student 
problems are manifold. A chief weakness of this type of 
program is the narrow range in which counseling competence 
exists.


The outline illustrates this narrow range of counseling 
competence. It presents crude continua of data interpretation 
for two problem categories—vocational and educational. It is 
relatively safe to assume that in neither of these continua does 





⁶ Francis T. Spaulding, High School and Life, The Regents’ Inquiry (New
York: McGraw-Hill, 1938). 




the classroom teacher compare favorably with counselors 
trained to interpret complex interrelated data. 

OUTLINE OF LEVELS OF COUNSELING

INTERPRETATION TO STUDENTS OF EDUCATIONAL PROBLEMS
(levels in order along the continuum, from least to most sophisticated)

1. Unvalidated Personal Impressions
2. Grades-Ratings-Student Choices
3. Simple Statistical Treatment of Measurement
4. Sophisticated Statistical Treatment of Measurement
5. Pattern Analysis of Individual from Synthesized Data

INTERPRETATION TO STUDENTS OF VOCATIONAL PROBLEMS
(levels in order along the continuum, from least to most sophisticated)

1. Unvalidated Student Choices
2. Occupational Application of Specific Subject Matter
3. Relation of School Subjects to World of Work
4. Occupational Information
5. Simple Statistical Treatment of Measurement
6. Sophisticated Statistical Treatment of Measurement
7. Pattern Analysis of Individual from Synthesized Data


Teacher Counseling and Educational Problems.—In 
the area of educational problems the average teacher works 
for the most part with unvalidated student statements or with 
equally invalid personal impressions. Many teachers interpret 
their grades and ratings to youths seeking counsel. A few 
teachers have become skilled to an extent that they can inter- 
pret various kinds of data in terms of simple statistical con- 
cepts. A very few are competent to interpret complex, related 
data, dependent for its meaning upon great statistical sophisti- 
cation. A rare individual can utilize job and man analyses 
in such a way that proper interpretation is supplied the coun- 
selee. Ineffective educational counseling by secondary school 
teachers is not a matter of speculation or assumption. Eckert 
and Marshall⁷ state that more than three of five high-school
students in New York State leave school before graduation. 
Many of those leaving school do so because of inability to 
meet the demands of the curricula in which they attempt to 
compete. Many of these students could profit from courses 
of study different from the ones in which they failed. 





⁷Ruth E. Eckert and Thomas O. Marshall, When Youth Leave School,
The Regents’ Inquiry (New York: McGraw-Hill, 1938), pp. 48-49.


248 







Edgerton and Toops⁸ estimate that only 34 per cent of
1,958 students included in a survey of Ohio college students 
had records indicating ability for eventual college graduation. 
Many of these students must have been advised by high-school 
teachers and administrators to attempt higher education at the 
college level. The literature of secondary school and college 
mortality points clearly to a large amount of poor advising 
about educational problems. Williamson⁹ offers a satisfactory
summarization of educational counseling by teachers
when he says: 

Fewer students would select inappropriate courses if reliable
statements of requirements of the wide variety of occupations
and professions open to high-school graduates were
available. ... The counseling use of such information would
enable more students to prepare for appropriate occupational
goals.


Many scholastic failures could be avoided if administrators 
and teachers would establish comparable and valid standards, 
so that students and counselors could better judge of future 
success in a course by past achievements in a related course. 


Teacher Counseling and Vocational Problems.—Because 
educational and vocational problems of youth are so closely 
related, much that has been said of teacher counseling and 
educational problems can also be said of vocational problems. 
The effects of poor vocational counseling upon the boy or girl 
are often more serious than those of poor educational coun- 
seling. If a student takes a course for which he is not suited, 
adjustments can be made through failure or change of course. 
These adjustments do not require long time spans. Poor 
vocational counseling can result in situations where the unad- 
justed individual may be forced by circumstances to spend 
long periods in which change is difficult or impossible. One 
has only to inspect the occupational choices of high-school 





⁸Harold A. Edgerton and Herbert A. Toops, “Academic Progress,”
Contributions in Administration 1 (Columbus, Ohio: Ohio State University Press,
1929), p. 136.
⁹E. G. Williamson, How to Counsel Students (New York: McGraw-Hill,
1939), pp. 260-261.




seniors to realize the unrealistic thinking which can supply 
fertile ground for poor advising. Stone¹⁰ demonstrates that
in one freshman group at the University of Minnesota only 
40 per cent of the students had occupational goals which were 
judged valid by competent clinical psychologists. A goodly 
proportion of these students came from high schools which 
have prided themselves upon their teacher counseling pro- 
grams over a long number of years. Students from these 
schools had no better choices than those from schools which 
made no pretensions to vocational counseling. Stone’s study 
indicates that counseling by professionally trained, clinical 
counselors reduces the number of poor vocational choices and 
increases the number of good choices. The gains are statistically
significant. Williamson and Bordin¹¹ found that
control-group (uncounseled) college students achieved a voca- 
tional adjustment judged to be satisfactory by themselves and 
the evaluating judges in 68 per cent of the cases. On the other 
hand, such an adjustment was achieved by 81 per cent of the 
cases in the experimental group (counseled by clinical coun- 
selors). Satisfactory adjustment was not made by 27 per 
cent of the control group and 15 per cent of the experimental 
group. These differences are statistically significant. 

Large numbers of high schools depend upon classroom 
teaching of occupational information to resolve vocational 
problems of students. This faith in “talking at” students has 
little to recommend it. Many of the studies which show ad- 
vantage to classroom group-counseling do so only in terms of 
gains in amount of occupational information. No one has 
produced evidence that students with the greatest amount of 
occupational information make the best vocational choices. It 
is obvious that a student who takes a course in any field of 
knowledge should know more about it than the student who 
has not had the same or similar courses. 

Many useful tools and techniques of counseling have been 





¹⁰C. Harold Stone, op. cit.
¹¹E. G. Williamson and E. S. Bordin, “Evaluating Counseling by Means
of a Control-group Experiment,” School and Society, LII (1940), 434-440.




originated or improved in the past decade. Teacher-coun- 
selors are seldom trained to collect and collate data originat- 
ing from these instruments and methods. They cannot be 
expected to be both teachers and applied psychologists. If 
teachers become clinical counselors, they no longer are class- 
room teachers. We need not expect adequate counseling in 
regard to the vocational problems of youth until our schools 
make use of persons other than classroom teachers to assume 
and discharge at least supervisory responsibilities for coun- 
seling. As will be stated later, this does not mean that each 
small school unit must or should have a professionally trained 
counselor or close up shop. 

Referral to the outline on page 248 indicates that rela- 
tively few teachers are so trained that they can interpret data 
to students adequately if such data involve more than the 
presentation of information of a simple nature. Sound job 
analysis by teachers offers serious difficulties. Valid man anal- 
ysis is beyond the ken of the average teacher. 


The Vocational Specialist’s Level of Counseling Competence 


The vocational specialist appeared on the personnel work 
scene in the second decade of the developing secondary school 
personnel work movement. A growing public sense of need 
to meet the pressing problems of youth forced educators to 
take cognizance of these problems. The movement was first 
directed toward emphasis upon “things to be done” rather
than toward “men and women who do things”; that is, job analysis,
not man analysis. This trend was clearly reflected in the
proposed qualifications for counselors which appeared in the 
literature of that period. Myers,¹² for example, wrote:

It is well to remind ourselves, however, that among the
qualifications, aside from special training, which those who
select counselors often emphasize are: (1) a personality
which attracts and gets on well with adolescents; (2) sufficient
maturity to command the respect of pupils and fellow





¹²George E. Myers, “A Training Program for Counselors,” Vocational
Guidance Magazine, V (1927), p. 315.




teachers; (3) at least as good a general education as is pos- 
sessed by the average high school teacher, usually represented 
by graduation from a college in good standing; (4) success- 
ful experience as a teacher; and (5) preferably, some business 
or industrial experience. (Italics not in original.) 

The Committee on Standard Certification of Vocational 
Counselors, a committee of the National Vocational Guidance 
Association, advocated the following college courses for permanent
certification:¹³

1. General courses—the usual courses required of candi- 
dates for teaching certificates: educational psychology, prin- 
ciples of teaching, educational measurements, sociology, eco- 
nomics. 

2. Related courses — principles and problems of voca- 
tional education, industrial history, labor problems. 

3. Guidance courses — principles and problems of voca- 
tional guidance, analysis of vocational activities, methods of 
imparting occupational information, psychological tests in 
guidance, counseling the individual, placement and follow-up, 
and field work in guidance. 

Inspection of these training programs indicates clearly the 
emphasis placed upon the job and the worker’s relationships 
to it. Adequate tools and techniques had not as yet been dis- 
covered to analyze and treat the individual as some one to 
whom a particular set of duties could be fitted. Practice was 
to treat the job as a set of duties to which a man or woman 
must be fitted. Vocational specialists dealt with vocational 
problems, i.e., job specifications, occupational trends, labor 
problems, how to get a job, placement, and follow-up. Voca- 
tional aspects of total adjustment were considered so impor- 
tant that other aspects of human adjustment were often over- 
looked or left to be treated by specialists not yet found in 
secondary schools. 

The unfortunate feature of the era of vocational special- 
ists is not that personnel work passed through the stage, but 
rather that the stage has persisted. Too many vocational 





¹³Leonard V. Koos and Grayson N. Kefauver, Guidance in Secondary
Schools (New York: The Macmillan Company, 1932), pp. 569-673.




specialists continue to think of personnel work as job descrip- 
tion and placement long after educational leaders have rele- 
gated their particular contribution to an important but sub- 
sidiary position in the field. 

Vocational interests are no longer what boys and girls say 
they want to do (usually stated as a job label). Vocational 
interests are analyzed today by considering youths’ claims in 
the light of observed behavior over a period of time and 
the leads furnished by various technical psychological meas- 
uring instruments. Selection of a career is no longer in terms 
of whether or not an individual can do a job. Rather the 
question is raised regarding what family¹⁴ of jobs and at what
level within this family the optimal vocational adjustment will 
tend to occur. We are not so prone to encourage an ado- 
lescent in a secondary school to be a doctor of medicine. We 
tend to direct to “scientific fields at the professional level.”

Evidence of failure to meet the vocational problems of 
youth through the services of vocational specialists is abundant
in the literature. Anderson¹⁵ made a strong case for
psychiatric services in industry. The conditions cited by An- 
derson raise questions as to the ways in which the men and 
women studied were guided to their occupational niches. 
Fisher and Hanna¹⁶ also contributed evidence that an alarming
number of workers were not being directed or helped into
suitable careers. There is little evidence to show that the 
worker who had the help of the vocational specialist made 
significantly better occupational choices than his non-counseled 
brother. 

Reference to the outline leads one to suspect that the vo- 
cational specialist has been handicapped in his work by his 
general inability to deal with man analysis. His contribution 
to personnel work, however, has not been small. The shift 





¹⁴Donald G. Paterson, Clayton d’A. Gerken, and M. E. Hahn, Minnesota
Occupational Rating Scales (Chicago: Science Research Associates, 1941).

¹⁵V. V. Anderson, Psychiatry in Industry (New York: Harper and
Brothers, 1929).

¹⁶V. E. Fisher and Joseph V. Hanna, The Dissatisfied Worker (New York:
Macmillan, 1931).




to emphasis upon men rather than jobs was hastened appre- 
ciably by his efforts. Nevertheless, general responsibility for 
counseling of youth can hardly be delegated to the vocational 
specialist. Except for a minor specialty, he is no more compe- 
tent than the teacher-counselor. 


The Clinical Counselor's Level of Counseling Competence 


The clinical counselor is a highly trained, widely experi- 
enced, applied psychologist. His training has been directed 
primarily to an understanding of people both as unique indi- 
viduals and as members of various groups. He is a specialist 
in one or more areas of human problems. At the same time 
he is a generalist, albeit with knowledge of his limitations. 
The place of the clinical counselor in the field of personnel 
work is well stated by Williamson¹⁷ when he says:


While clinical counseling is only one of several specialized 
fields dealing with personal problems, we maintain, however, 
that it is the basic type of personnel work with individual stu- 
dents and serves to coordinate and focus the findings and 
efforts of other types of workers. 


Paterson, Schneidler, and Williamson¹⁸ contend that persons
training to be clinical counselors should complete the
Master’s degree in psychometrics or its equivalent. Further 
they consider the Ph.D. degree or its equivalent in techno- 
logical psychology as desirable. The counselor so trained 
should be a master of the tools and techniques that fill the 
competent counselor’s kit. The clinical counselor should be 
competent in the full range of data interpretation found in 
the outline. 

It is not our present purpose to analyze the competency 
and functions of the counselor. It is safe to assume that such 
a one is, at the present time, our best trained personnel worker 





¹⁷E. G. Williamson, How to Counsel Students (New York: McGraw-Hill,
1939), p. 36.

¹⁸Donald G. Paterson, G. Schneidler, and E. G. Williamson, Student Guidance
Techniques (New York: McGraw-Hill, 1938), pp. 302-303.




with the vocational-educational problems of youth. Evidence 
has been marshalled which indicates that counselors at this 
high level of competence do achieve appreciably better out- 
comes of counseling than is true of other individuals working 
with vocational-educational problems of youth. It is our pur- 
pose to urge that we make use of these people even in small 
school systems which cannot add them to their full-time staffs. 

The average American high school is small. Sound school- 
community personnel programs must include the pooling of 
resources in order that these small schools can have many 
kinds of services which they could not afford alone. Adminis- 
trators in small secondary schools will find that complete youth 
personnel programs are beyond them when they consider only 
the community which they serve. When consideration is given 
to the combined resources of five, six, or seven schools, there 
are practical solutions to the problem. Sharing the services of 
clinical counselors is such a solution. When an administrator 
faces an in-service training program for teacher-counselors 
with no professional assistance, he is involved in difficulties. 
When he faces in-service training of teacher-counselors as part 
of a county or district project in charge of a competent in- 
structor, many of these difficulties disappear. 

Few small communities can supply enough personnel work 
with youth to occupy the full time of a clinical counselor who 
concentrates upon vocational-educational problems. Part-time 
aid from such a counselor will, in many instances, be all that is 
needed to develop the school-community personnel program. 
A qualified counselor can discharge personnel functions in sev- 
eral schools. For example, research on problems common to 
several schools may be conducted almost as easily as for a
single institution. In-service training in a district may be little 
more difficult than in a single institution. Counseling of diffi- 
cult cases in a district may involve no greater case load than 
would be true in a single large institution or community. De- 
velopment of several sound school-community programs at 
one time in cooperation with other personnel agencies is not an 
unreasonable task. 




We have had time to discover the gross errors in assigning 
major counseling responsibilities to teachers and vocational 
specialists. We have not yet had time to correct these errors. 
Personnel work with youth in the post-war period will go forward,
although there is no guarantee that the public schools
will retain the golden opportunity they have had to develop 
the field. If secondary school administrators will forget tradi- 
tion and face the tasks ahead realistically, if they will turn 
from subject matter and “things for people to do,” if they 
will make use of the best personnel workers, they may yet 
retrieve the losses in public support and esteem which they 
suffered in the depression and war years. Much depends upon 
the level of counseling effectiveness which the schools achieve. 










A STUDY OF SOME LOCAL FACTORS AFFECTING 
STUDENTS’ SCORES ON THE MINNESOTA 
PERSONALITY SCALE 


BETTY M. HORNE AND W. C. McCALL


University of South Carolina 


A FRESHMAN TESTING program including tests of
general scholastic aptitude as well as tests of achievement
and aptitude in specific subject fields has become common
practice in our colleges and universities. The results of
these tests are customarily used in counseling with the student 
on academic and vocational problems, and in predicting the 
student’s probable academic success. 

Many institutions also include in their testing program 
instruments which are designed to measure the student’s vocational
interests and aptitudes and scales designed to evaluate
his personality adjustment. Both the felt need for information
of this nature as an aid to more effective guidance
procedures and the increased reliability and validity of recent 
measuring instruments have undoubtedly influenced this trend. 

However, the effect of local factors on all test scores and 
more particularly on tests of personality has long been recog- 
nized. The present study reports a systematic attempt to ana- 
lyze these factors in a specific situation. In September, 1941, 
the Minnesota Personality Scale was added to the regular 
list of freshman tests at the University of South Carolina. 
The test was thus administered to 241 freshman men and 
144 freshman women. 

Two different forms of the tests have been published, one 
for men and one for women. The total scale consists of five 
sub-scales measuring, respectively, Morale, Social Adjustment, 




Family Relations, Emotionality, and Economic Conservatism. 
The authors of the test, Darley and McNamara, describe 
these sub-scales as follows: 


Part I, Morale: High scores are indicative of belief in
society’s institutions and future possibilities. Low scores
usually indicate cynicism or lack of hope in the future.

Part II, Social Adjustment: High scores tend to be characteristic
of the gregarious, socially mature individual in
relations with other people. Low scores are characteristic
of the socially inept or undersocialized individual.

Part III, Family Relations: High scores usually signify
friendly and healthy parent-child relations. Low scores
suggest conflicts or maladjustments in parent-child relations.

Part IV, Emotionality: High scores are representative of
emotionally stable and self-possessed individuals. Low
scores may result from anxiety states or over-reactive
tendencies.

Part V, Economic Conservatism: High scores indicate
conservative economic attitudes. Low scores reveal a
tendency toward liberal or radical points of view on current
economic and industrial problems.


As is customary at the University of South Carolina, 
norms for the local population were set up and have been 
used in rating all students. A comparison of these norms
with those published for the University of Minnesota population
on which the test was standardized comprises the first
results of this study, and is presented in Table 1.


Comparison of Minnesota and South Carolina Scores 


Since the Minnesota norms are stated in terms of per- 
centile values, it was necessary to base all statistical calcula- 
tions on these values. Accordingly, the critical ratios are ex- 
pressed in terms of the difference between fiftieth percentile 
points divided by the P.E. of the difference between these me- 
dians and must equal 4.0 or more in order to be statistically 
significant. Since there are different forms of the test for 
men and women, the critical ratios for the two sexes have been 
calculated separately. 
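
As an illustrative sketch only, the critical ratio described above can be worked through in Python. The article does not state how the probable error of a median was obtained; the sketch assumes the textbook approximation P.E.(median) = 1.2533 × P.E. / √N, where P.E. is the probable error of the score distribution as listed in Table 3, and it assumes the two samples are independent. The function names are hypothetical.

```python
import math

SIGNIFICANCE_THRESHOLD = 4.0  # criterion stated in the text for P.E.-based ratios


def pe_of_median(pe_distribution: float, n: int) -> float:
    """Probable error of a median (assumed formula: 1.2533 * P.E. / sqrt(N))."""
    return 1.2533 * pe_distribution / math.sqrt(n)


def critical_ratio(median_a, pe_a, n_a, median_b, pe_b, n_b) -> float:
    """Difference between two medians divided by the P.E. of that difference."""
    pe_difference = math.hypot(pe_of_median(pe_a, n_a), pe_of_median(pe_b, n_b))
    return abs(median_a - median_b) / pe_difference


def is_significant(ratio: float) -> bool:
    """Apply the 4.0 criterion used for ratios expressed in probable errors."""
    return ratio >= SIGNIFICANCE_THRESHOLD


# Scale I (Morale), medians and P.E. values as listed in Table 3.
cr_men_morale = critical_ratio(167, 9, 1083, 172, 8.5, 241)
cr_women_morale = critical_ratio(173, 9, 888, 178, 8.5, 144)

print(round(cr_men_morale, 1), is_significant(cr_men_morale))
print(round(cr_women_morale, 1), is_significant(cr_women_morale))
```

Under these assumptions the Morale comparisons come out near 6.5 for men and 5.2 for women, both above the 4.0 criterion.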




TABLE 1

COMPARISON OF SCORES OF MINNESOTA STUDENTS AND SOUTH CAROLINA
STUDENTS ON THE MINNESOTA PERSONALITY SCALE

                                       Sub-Scales
Median Raw Score            I       II      III     IV      V
Minn. men (N = 1083)        167     224     138     159     106
S. C. men (N = 241)         172     230     149     163     105
Critical ratio*             6.5     3.3**   11.3    3.6     1.9

Minn. women (N = 888)       173     228     149     168     104
S. C. women (N = 144)       178     237     158     170     104
Critical ratio*             5.2     4.7**   7.8     1.1     0

*Critical ratio = Difference between medians divided by probable error of
this difference.

**Critical ratio of difference between Minnesota median and South Carolina
mean for women = 2.1; for men = 2.2.

The table indicates two areas in which the South Carolina 
students appear to be better adjusted than the Minnesota stu- 
dents, Morale and Family Relations. The critical ratios for 
these scales are significant for both men and women. Data to 
be presented below indicate that the latter difference, that in 
Family Relations, is probably related to the relatively smaller 
number of large centers of population in South Carolina than 
in Minnesota. 

The explanation for the difference in general morale is 
less apparent. The items in this sub-scale may very roughly 
be divided into three main groups, namely, questions concern- 
ing faith in the honesty and adequacy of our legal system, 
questions dealing with faith in the value and methods of our 
educational system, and faith in the possibilities which the
future holds for the individual. It must be emphasized that 
this division is both arbitrary and rough, and since no data 
were available for an item analysis, no comparisons within 
this sub-scale are possible. A generalized statement based 
on the authors’ description of the area of personality adjust- 
ment measured by this scale would indicate that the South 
Carolina students had significantly more faith in society and 




its institutions and in their own future than did the Minnesota 
students. 

Scores of entering South Carolina freshmen on the Test 
of General Proficiency in the Field of the Social Studies of the 
Cooperative General Achievement Tests indicate that their 
acquaintance with the strengths and weaknesses of our Ameri- 
can social institutions is more limited than that of the 6296 
freshmen on whom the test was standardized. The South 
Carolina mean fell at the thirty-fifth percentile point for the 
standardization group. It is possible that this lack of ac- 
quaintance has tended to promote an uncritical acceptance of 
these institutions. It is not improbable that this difference, 
also, is related to the factor of population distribution, al- 
though the subsequent data do not strongly suggest such a 
conclusion. 

The scores on Scale II would suggest that the South Caro- 
lina students are more interested in and better adjusted to 
social group life. The nature of the questions on this scale 
strongly suggests that it measures a factor closely resembling 
the usual definitions of introversion-extroversion. Although 
the foregoing statement may be interpreted as lending support 
to the tradition of hospitality and sociability of the Southern 
home, the statistics also offer an alternative explanation. 

The distributions of scores on this particular sub-scale 
seem to be somewhat skewed, since the median women’s score 
is 5 points higher than the mean and the median men’s score 
is 2 points higher than the mean. The discrepancy of 5 points 
in the women’s distribution is the greatest difference between 
median and mean in any of the 10 distributions, the two dif- 
ferences on Scale III being 2.5 points, and all others being 
2 points or less. It is the only instance in which such dis- 
crepancy affects the significance of the difference between the 
scores of Minnesota and South Carolina students. As the 
footnote to the table indicates, the difference between the 
Minnesota median and the South Carolina mean for women 
on Scale II, divided by the probable error of the difference 
between the two medians, is only 2.1, a critical ratio which 




is not significant; the corresponding ratio for men is 2.2. The 
reader may choose the explanation of these data which seems 
to him most logical and acceptable. 


Relation of Scores to Size of Home Town 

We have referred above to an analysis of the data from 
South Carolina students based on size of population centers. 
The University is located in Columbia, the capital of the 
state and a city of about 75,000. Thirty-five to forty per 
cent of the students are from Columbia. The comparison of 
adjustment scores for students from population centers of dif- 
ferent sizes was originally suggested by a clinical observation 
that there seemed to be a disproportionate number of students 
from Columbia who showed one single low score on Scale 
III, Family Relations. 

The complete set of data is presented in Table 2. The 
students were divided into three groups, according to their 
home residence and the size of the town in which they at- 
tended high school. Class A includes all students who re- 
sided in and attended high school in cities of 25,000 and 
over; Class B includes those who resided in and attended 
high school in towns of 2500 to 25,000; and Class C those 


TABLE 2

COMPARISON OF SCORES OF SOUTH CAROLINA STUDENTS FROM TOWNS
OF CLASS A, CLASS B, AND CLASS C*

                                       Sub-Scales
Mean T-score value          I       II      III     IV      V
Class A (N = 183)           49      47      44      48      47
Class C (N = 80)            51      50      47      48      49
Class B (N = 91)            52      56      55      54      52

Critical ratios**
Class A vs. Class B         1.0     3.6     4.3     2.3     2.1
Class B vs. Class C         0.3     2.2     2.8     2.1     1.1
Class A vs. Class C         0.6     1.1     0.9     0.2     0.6

*Class A: Towns with a population of 25,000 and over.
 Class B: Towns with population of 2500 to 25,000.
 Class C: Towns with population of less than 2500 and rural districts.
**Critical ratio = Difference between means divided by the standard error
of this difference.




who attended high school in a town of 2500 or less and re- 
sided in a town of that size or gave a rural address. 

It should be noted that Class B was originally divided 
into groups from towns of 2500 to 10,000 and from towns 
of 10,000 to 25,000. Differences in average score between 
these two groups were generally small, the group from towns 
of 10,000 to 25,000 being in most instances somewhat lower. 
However, the number of students falling in the group from 
10,000 to 25,000 was so small as to make any real conclusions 
impossible. The present Class B is composed of 68 students
from towns of 2500 to 10,000 and 23 students from towns 
of 10,000 to 25,000. 

TABLE 3

MEASURES OF VARIABILITY FOR AVERAGES PRESENTED IN
TABLES 1 AND 2

                                          Sub-Scales
Median Raw Score ± P.E.     I         II         III        IV         V
Minn. men (N = 1083)        167±9     224±21.5   138±12.5   159±12     106±8
S. C. men (N = 241)         172±8.5   230±20     149±10.5   163±12.5   105±5.5
Minn. women (N = 888)       173±9     228±18     149±14     168±16     104±6
S. C. women (N = 144)       178±8.5   237±17     158±9.5    170±16.5   104±5

Mean T-score ± S.D.
Class A (N = 183)           49±21     47±22      44±21      48±20      47±19
Class C (N = 80)            51±22     50±19      47±20      48±19      49±20
Class B (N = 91)            52±18     56±19      55±19      54±19      52±16





An explanation of certain statistical techniques used in
the calculations of Table 2 is necessary. All individual scores 
for South Carolina freshmen were translated into T-score 
values determined from the means and standard deviations 
of the distributions of raw scores. The scores presented are, 
therefore, average T-score values, not average raw score 
values. The use of these T-score values provides a basis on 
which scores for men and women may be combined for statis- 
tical treatment. The T-scores for each sex are based on the 
scores of that sex, but a given T-score value for men and 
women is considered comparable. 
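
The conversion described above can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: the article does not state the constants of its T-score scale, so the conventional center of 50 and spread of 10 are assumptions left as parameters (the standard deviations near 20 reported in Table 3 suggest a different spread may have been used). The raw scores and the function name are hypothetical.

```python
from statistics import mean, pstdev


def to_t_scores(raw_scores, center=50.0, spread=10.0):
    """Convert one sex's raw scores to T-scores from that sex's own mean and S.D.

    center and spread are assumed constants, not values given in the article.
    """
    group_mean = mean(raw_scores)
    group_sd = pstdev(raw_scores)
    return [center + spread * (x - group_mean) / group_sd for x in raw_scores]


# Hypothetical raw scores, for illustration only (not data from the study).
men_raw = [150, 160, 172, 181, 195]
women_raw = [158, 170, 178, 186, 199]

# Each sex is standardized separately, then the T-scores are pooled.
men_t = to_t_scores(men_raw)
women_t = to_t_scores(women_raw)
combined_t = men_t + women_t

print(round(mean(combined_t), 1))
```

Because each sex is standardized against its own distribution, a man and a woman who stand equally far above their respective means receive the same T-score, which is what allows the two sets to be pooled.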

For convenience in reading, Class C is shown between
Class A and Class B in Table 2. The evidence provided by




the table is both striking and self-explanatory. On every scale 
the average scores of students gradually increased from Class 
A through Class C to Class B. At least for South Carolina 
high school students the environment most conducive to per- 
sonality adjustment seems to be a Class B community; that is, 
a town from 2500 to 25,000. Very small towns or rural dis- 
tricts appear somewhat more favorable than metropolitan 
centers. 

Critical ratios presented in this table are based on the 
difference between means divided by the standard error of this 
difference, and are thus significant at 3.0. The differences 
between Class A and Class B students on Scales II and III, 
Social Adjustment and Family Relations, meet this criterion; 
and the differences between Class B and Class C show 98.6 
and 99.7 chances in 100, respectively, of a true difference on 
these same scales. 
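
A rough Python sketch of these two calculations follows. It assumes, since the article does not say, that the standard error of a difference between independent means is √(S.D.₁²/N₁ + S.D.₂²/N₂) and that "chances in 100" is the one-tailed normal probability for the critical ratio. The function names are hypothetical; the inputs are the rounded Scale III values for Class A and Class B from Tables 2 and 3.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b) -> float:
    """Difference between means over the standard error of the difference
    (assumed formula for two independent groups)."""
    se_difference = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return abs(mean_a - mean_b) / se_difference


def chances_in_100(ratio: float) -> float:
    """One-tailed normal probability, times 100, that the true difference
    lies in the observed direction (assumed interpretation)."""
    return 100.0 * 0.5 * (1.0 + math.erf(ratio / math.sqrt(2.0)))


# Scale III (Family Relations): Class A 44 (S.D. 21, N = 183), Class B 55 (S.D. 19, N = 91).
cr_family_a_vs_b = critical_ratio(44, 21, 183, 55, 19, 91)

print(round(cr_family_a_vs_b, 2), cr_family_a_vs_b >= 3.0)
print(round(chances_in_100(2.2), 1), round(chances_in_100(2.8), 1))
```

Under these assumptions, critical ratios of 2.2 and 2.8 correspond to 98.6 and 99.7 chances in 100. The Class A versus Class B ratio computed from the rounded table entries is about 4.36, slightly above the printed 4.3; a small discrepancy of this kind is to be expected when the means are rounded to whole T-scores.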

It seems safe to conclude, therefore, that towns of medium 
size provide a significantly better background for the develop- 
ment of social maturity and extrovertive social relationships 
than either a city or a very small community. Speculation 
concerning the complex factors operating to produce these 
differences would be interesting but highly subjective. 

At least one factor affecting the home and family adjust- 
ments would seem, however, to be less complicated. One 
large group of questions contained in this sub-scale relates 
to possible maladjustments arising from the young person’s 
efforts to establish his social and personal independence. It 
seems logical to suppose that these areas of family relation- 
ship are subject to greater strain in either a metropolitan cen- 
ter or a small town than in a medium-sized urban community. 
The “temptations” of city life have been discussed perhaps 
far too much in our recent sociological literature, but the op- 
portunity and motivation toward social independence offered 
by the recreational, social, and even school-sponsored activi- 
ties of a city are too obvious to require elaboration. Any 
effort of the parents to counteract these influences will almost 
inevitably lead to family conflict. 




In the rural home, on the other hand, it seems quite prob- 
able that the source of conflict lies in the moral conservatism 
which is generally assumed to be characteristic of farm par- 
ents. We are here assuming, of course, that towns under 
2500 resemble rural areas in their mores and attitudes, an 
assumption which is probably justified. If this is the case, 
rural and small town students undoubtedly find themselves in 
conflict with their parents over proposed activities which would 
receive no frown of disapproval from the city parent. 

The authors are aware that the foregoing paragraphs of 
interpretation involve several assumptions with which the 
reader may disagree. The data are clear-cut in their exposi- 
tion of the facts; the interpretation of these facts must of 
necessity be somewhat subjective. 

On a third sub-scale, number IV, Emotionality, the data 
indicate differences approaching statistical significance between 
students of Class A and Class B, and between Class B and 
Class C. These differences are probably related to the cor- 
relations presented in Table 4. Since the only correlations in 
this table above .50 are those between Scales II and IV and 
Scales III and IV, at least some of the factors which influence 
Scales II and III must influence Scale IV in a similar direction. 
It appears probable that the differences on Scale IV are depen- 
dent on these relationships to some extent. 


Inter-Relationships Between Scales 


The data of Table 4 are of special interest on two points. 
The similarities of the correlations based on the scores of 
Minnesota students and of South Carolina students are strik- 
ing. The critical ratios of the differences of these various 
correlations were calculated, and of the 20 comparisons thus 
made, only one critical ratio, that between Scales II and IV 
for men, was above 2.0, and this one did not reach significance. 
As the table indicates, the range of average correlations for 
the four groups studied is only .02. 
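
The source does not state which formula was used for these critical ratios. As a minimal sketch only, the comparison of two correlations from independent groups can be illustrated with Fisher's z transformation, one common approach. The group sizes below are those reported for the men in the Summary (1,083 at Minnesota, 241 at South Carolina); the two correlation values and the function names are hypothetical and chosen purely for illustration.

```python
import math

def fisher_z(r: float) -> float:
    """Fisher's z transformation of a correlation coefficient."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

def critical_ratio(r1: float, n1: int, r2: float, n2: int) -> float:
    """Difference between two independent correlations over its standard error."""
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / standard_error

# Hypothetical correlations (not taken from Table 4); group sizes from the Summary.
cr = critical_ratio(0.40, 1083, 0.30, 241)
print(round(cr, 2))
```

A ratio computed in this way is read like any other critical ratio: the larger its absolute value, the less plausible it is that the two groups share a single underlying correlation.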

The second point of interest concerns the relationships
of Scale IV to other scales in the test. This sub-scale measures
emotional stability, and includes many questions usually
found in an inventory of neurotic traits. The fact that it reflects
and is reflected in other areas of personal adjustment is,
therefore, quite in harmony with repeated observations of
clinical psychologists. As has been pointed out, the only correlations
above .50 in the table involve this sub-scale. The
average inter-correlation of this scale with all other scales, for
all groups involved, is .40. The average inter-correlations
of the other scales, exclusive of their correlation with Scale
IV, are .29, .27, .27, and .18, respectively, for Scales I, II,
III, and V.


TABLE 4

INTER-CORRELATIONS OF THE FIVE SUB-SCALES OF THE
MINNESOTA PERSONALITY SCALE

[The headings of the four group columns are illegible in the
source; entries marked "--" are likewise illegible.]

Correlation                 (1)    (2)    (3)    (4)
Scale I   with II           .41    --     .36    .31
Scale I   with III          .26    .33    .34    .26
Scale I   with IV           .38    .36    .38    .34
Scale I   with V            .21    --     .18    .18
Scale II  with III          .25    .36    .26    --
Scale II  with IV           --     .39    .48    .51
Scale II  with V            --     .19    --     --
Scale III with IV           .52    .56    .54    .58
Scale III with V            .24    --     .16    --
Scale IV  with V            --     .13    .15    .28
Average                     .32    .32    .30    .31


Summary and Conclusions 


The Minnesota Personality Scales for men and women 
have been administered to 241 freshman men and 144 fresh- 
man women at the University of South Carolina. The scores 
of these students have been compared with the norms data 
published for the scale, based on the scores of 1,083 freshman
men and 888 freshman women at the University of Minnesota.
The total scale consists of five sub-scales which measure
Morale, Social Adjustment, Family Relations, Emotionality,
and Economic Conservatism.

The data herein presented support the following con- 
clusions: 

(1) South Carolina students, both men and women, ob- 
tained scores indicating a significantly better adjustment in 
Morale and in Family Relations than those of the Minne- 
sota students. There is some evidence that the South Caro- 
lina students are superior in Social Adjustment, though this 
conclusion is not clearly substantiated. 

(2) South Carolina students from towns of 2500 to 25,000
population (Class B) appear somewhat better adjusted
on all scales than students of towns of 2500 or less (Class 
C), and the latter slightly better adjusted than students from 
cities of 25,000 and over (Class A). These differences reach 
significance between Class A and Class B in Social Adjustment 
and in Family Relations, and approach significance between 
Class B and Class C on the same scales. They approach sig- 
nificance between Class A and Class B and between Class B 
and Class C in Emotionality. 

(3) Inter-correlations between the several sub-scales 
based on data from Minnesota students and from South Caro- 
lina students are strikingly similar. The scale measuring Emo- 
tionality shows the only inter-correlations above .50, and 
shows a higher average inter-correlation than any other sub- 
scale. Correlations between Emotionality and Social Adjust- 
ment and between Emotionality and Family Relations are 
above .50. 

REFERENCES 

1. Darley, John G., and McNamara, Walter J. Minnesota Personality 
Scale, Manual of Directions. New York: The Psychological Cor- 
poration, 1941. 

2. Darley, John G., and McNamara, Walter J. Minnesota Person- 
ality Scale (For Men). New York: The Psychological Corporation, 
1941. 

3. Darley, John G., and McNamara, Walter J. Minnesota Person- 
ality Scale (For Women). New York: The Psychological Cor- 
poration, 1941. 

4. Willis, Mary, et al. Cooperative General Achievement Tests, Num- 
ber I. A Test of General Proficiency in the Field of the Social 
Studies, Form QR. New York: The Cooperative Test Service, 1940. 







THE PLACE OF APTITUDE TESTING IN 
THE PUBLIC SCHOOLS 


DONALD E. SUPER 
Clark University 


IN THE PRACTICE of aptitude testing, three basic
assumptions are important. These assumptions have been
so well established by research in the psychological labora- 
tories, in the schools, and in industry that they are now gen- 
erally taken for granted and need little justification. 

One assumption is that individuals differ in the extent to 
which they possess any given aptitude, some being well en- 
dowed with the aptitude, let us say, to sing, others having 
little aptitude for vocal music, and most of us being potentially 
only mediocre singers. 

The second assumption is that there are a number of 
special aptitudes, such as aptitude for musical expression, apti- 
tude for mechanical work, aptitude for visualizing the rela- 
tions of objects in space, scholastic aptitude, manual dexterity, 
and aesthetic judgment. 

The third assumption is that there are important differ- 
ences in the amounts of these various aptitudes possessed by 
a given individual. 

Dr. Walter Dill Scott, pioneer industrial psychologist and 
until recently president of Northwestern University, has an 
interesting story illustrating this point. According to personnel 
data compiled by him, the most successful salesman in a whole- 
sale food company was also its least intelligent salesman. 
Unable to reconcile these two items of information, Dr. Scott 
investigated further, found that this was indeed the case, and 
sought an explanation. He found that the salesman would go 
into a delicatessen, let us say, and chat with the owner and 
his wife. The conversation generally dealt with family and 
neighborhood affairs, about which the salesman kept posted. 
Finally he would get around to pickles and other items of
business. Then another man would enter the picture, a second
salesman employed to work with him, who discussed prices, 
took orders, filled out blanks, and performed other clerical 
tasks which the star salesman could not handle. It actually 
paid the company to employ two men to do one man’s work! 
Such extreme variations of abilities in one individual are the 
exception, as Dr. Terman demonstrated in his “Genetic Studies
of Genius,” but the extreme case illustrates a less extreme 
tendency toward stabilization of aptitudes and abilities with- 
in individuals. 

These three assumptions have provided us with a basic 
philosophy of education and of guidance, together with a 
working program for the schools. Recognizing the potential 
worth of each individual, it becomes incumbent upon us, as 
members of a democracy, to provide for individual differences 
in the children with whom we work. It is also important, if 
we are to make our democratic system effective, to study the 
individual differences in our pupils and to help them under- 
stand their own abilities and interests, in order that they may 
choose wisely from among the various educational offerings 
provided. It is not enough to develop differentiated curricula, 
as will shortly be demonstrated, unless we also provide the 
means of making wise choices of curricula. It is at this point, 
of course, that aptitude tests enter the picture. 

Before proceeding to discuss these last in some detail, let 
us dwell briefly on each of these two aspects of the working 
program of a democratic educational system, curricular differ- 
entiation and individual analysis, imposed upon us by our 
recognition of individual differences, special abilities, and trait 
differences. 

The history of American secondary education is in effect 
the history of a long drawn-out and not altogether conscious 
attempt to provide for individual differences. The colonial 
Latin Grammar School existed to provide pre-professional edu- 
cation, to prepare for college boys who were to enter the 
learned professions. It was largely supplanted by the Academy,
the purpose of which was to add two new types of education
for two new types of pupils. It offered scientific and
commercial training for those who were planning to enter 
technical occupations and the field of business, in addition to 
academic courses. The public high school entered the picture 
in the last century in order more effectively to provide these 
same types of education. Its purpose was to offer pre-profes- 
sional and pre-commercial training, as an analysis of its sub- 
jects and of the then current vocational conditions would show. 
In more recent years the Industrial Arts course and the Trade 
School have been developed to meet the needs of those who 
are likely to enter the skilled trades. Some of the recent evalua- 
tive studies of public education, such as the Regents’ Inquiry 
in New York State, now advocate the development of a fourth 
type of secondary education, a high school with a curriculum 
designed to prepare youth for work in the semi-skilled trades 
and for the patterns of living typical of those employed at 
that level. A few schools already provide such courses. 

If we analyze the trends so briefly described above, we see 
at once the increasing differentiation of our educational offer- 
ings as a result of the recognition of individual differences 
and the demand for appropriate curricula. 

One might expect, once a reasonable variety of educational 
offerings is provided, that the distributive mechanism of a 
democratic educational system would function smoothly. Pupils 
and their parents could look over the offerings and choose 
appropriate courses, especially if given the benefit of the 
advice of teachers and principals who are familiar with both 
the children and the courses. The practices of many schools 
have been based on this assumption. Recent years, however, 
have shown that this is unwarranted, for numerous thorough 
studies have indicated that large numbers of young persons obtain
an education of a type not suited to their vocational prospects. 

This statement should be illustrated with concrete facts, 
for such claims are all too frequently made without adequate 
foundation. Two-thirds of our youth of high school age
attend high school, that is, are in schools which prepare for
professional, commercial, or skilled employment. But only 
one-fourth of our employed adult population is actually en- 
gaged in occupations of these types. This means that five- 
twelfths, or approximately one-half, of the young people whom 
we are attempting to educate are actually being prepared for 
vocations and for ways of life which they will not enter. To 
put this in another way, our young people now tend to get 
an education planned in terms of the upper half of the occu- 
pational and social scale, whereas most of them enter and 
remain in occupations in the lower half of the scale. Surely 
no further proof is needed that young people need vocational 
guidance, that is, help in understanding and acting on their 
own abilities, interests, and opportunities. 
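
The arithmetic behind the five-twelfths figure can be checked directly. The sketch below uses only the two fractions given above and assumes, as the argument itself does, that the present occupational distribution of employed adults will also hold for the youth now in school; the variable names are illustrative.

```python
from fractions import Fraction

# Fractions quoted in the text.
in_high_school = Fraction(2, 3)       # youth of high school age attending high school
jobs_of_these_types = Fraction(1, 4)  # employed adults in professional, commercial, or skilled work

# Share of all youth prepared for occupations of a type they will not enter.
surplus_of_all_youth = in_high_school - jobs_of_these_types

# The same surplus expressed as a share of those actually enrolled.
surplus_of_enrolled = surplus_of_all_youth / in_high_school

print(surplus_of_all_youth)   # 5/12
print(surplus_of_enrolled)    # 5/8
```

Read against all youth of high school age the surplus is five-twelfths, or about 42 per cent; read against those actually enrolled it is five-eighths, or about 62 per cent. The phrase "approximately one-half" lies between the two readings.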

Given several different types of secondary school curricula, 
and granted the need to help students decide which type of 
curriculum to pursue, we must then devise methods of in- 
dividual analysis which will assist them in making curricular 
and consequent vocational choices. 

Various methods suggest themselves. We may examine a 
student’s marks in the different subjects which he has studied 
and find out in which types he has done the best work. But 
this approach has at least two important limitations: the 
courses which he has taken have been limited in variety and 
in number, and teachers’ marks are frequently unreliable and 
invalid indices of the quality of the work done. Useful as 
such data are in understanding a pupil, they cannot in and of 
themselves suffice for the task at hand. 

We may keep cumulative records in which are noted not 
only the pupil’s marks, but his extra-curricular activities, his 
relations with his fellows, his special interests and out-of- 
school activities. These can, as we know from their wide use, 
be very helpful in understanding a child. But, again, the ex- 
periences are likely to be limited in variety (a defect which 
can be at least partly overcome) and the evaluations made 
of these experiences are necessarily subjective. They frequently
do not permit comparisons with other persons, and
their real significance for curriculum and for vocations is too 
often not clear. 

Perhaps a digression is desirable to illustrate this last 
point, namely, the doubtful nature of some of the relationships 
which we think we see between hobbies and school subjects or 
vocations. It is widely assumed among philatelists that as a 
result of collecting stamps they learn a good deal more con- 
cerning history, geography, and related subjects than they 
otherwise would. To check up on this assumption, a series of 
studies of adolescent and adult stamp collectors were made. 
They were given tests of intelligence, of achievement in the 
social studies, and of technical philatelic matters. The same 
tests were given to a control group of non-philatelists. We 
found that the adult stamp collectors had learned a great deal 
more about stamp collecting, a little more about strictly 
factual aspects of geography (such as names of capitals), and 
nothing more about significant social problems. The adolescent 
stamp collectors learned nothing but the technology of phil- 
ately from their hobby. These and other studies suggest that 
information concerning a pupil’s activities must be used with 
caution as an index of aptitude, of achievement, or of interest 
in supposedly related fields. 


It should be clear that aptitude tests are needed as a sup- 
plement to these other not too effective methods of analyzing 
human abilities, interests, and achievements. They are needed 
because, when well constructed and wisely used, they are ob- 
jective, because they make possible comparisons between peo- 
ple, and because their curricular and vocational significance 
can be established with relative ease. These three concepts 
of quantification, reliability, and validity are now so generally 
familiar that they need not be elaborated. 

We may ask next: What aptitude tests should one use 
in a school, and at what age should they be used? Before an- 
swering that question we must find the answer to another: 
What do we want to measure, and when must we measure it? 




To reply to these questions in general terms at first, we 
want to measure those characteristics which are important in 
success at a given stage of a child’s educational or vocational 
career some time before he enters that stage, in order that he 
may plan for it with wisdom. This means that different types 
of traits and abilities may well be measured at different ages 
and stages, as life’s decisions make those data desirable. 

What are these stages when decisions are being made? 
One of the first, obviously, is when the pupil leaves the ele- 
mentary or junior high school to enter high school. Another 
is when he leaves high school to enter a vocation or a college. 
Still another is when he leaves college to enter an occupation. 
If our schools are in fact pre-vocational, the data needed at 
one stage are substantially the same as those needed at 
another. 

When a student leaves the lower school to enter high 
school, he has to make decisions concerning the type of high 
school and the type of curriculum, concerning the specific 
course within the curriculum, and, since the curricula are in a 
sense pre-vocational, concerning the general family of occupa- 
tions which he wishes to enter. These decisions must be based 
on the abilities of the pupil as they relate to the requirements 
of the courses. Tests should therefore be selected so that they 
will tap the various aptitudes, interests, and achievements 
which make for success and satisfaction in those courses. 


To profit from the college preparatory course a pupil 
should have high average or superior mental ability, for much 
of the content of the course is abstract; an extensive vocabu- 
lary, since the subject matter is contained in books and since 
its exercises are generally verbal; superior reading speed and 
comprehension, for the same reasons; ability to work with 
numbers, since numerical symbolism and manipulation are
important in many subjects, especially for the future technologist;
interest in the why, how, and whence of things and of ideas, 
since of such is the content of most academic subjects and the 
basis for most of the professions. To the prospective college
preparatory or academic student one would, therefore, want
to give tests of scholastic aptitude, vocabulary, reading, math- 
ematics and other academic subjects, and interests. 

Are such tests available for people as young as 14 or 15? 
Intelligence or scholastic aptitude tests have, of course, long 
been in use at all ages. Achievement tests in the tool subjects 
are equally well standardized and validated. Both can be given 
by teachers with a minimum of training in test techniques and 
can be scored for relatively little money. Interest tests are 
not so well developed at this level, but there are at least two, 
and probably three or four, which can be given to early adoles- 
cents with some confidence if the results are to be used by 
competent people. Scoring may run into more money, but the 
cost is not necessarily prohibitive. 

Singularly little attention has been given in most localities 
to the qualities needed for success in the commercial course, 
although this has to some extent been remedied in more recent 
years. Again, intelligence or scholastic aptitude plays a part, 
although it need not be present in the same degree as in the 
college preparatory course. A good deal of subsequent dis- 
appointment would probably be avoided if most pupils with 
I.Q.’s of less than 105 or 110 could be motivated to choose
courses other than the academic, and if most of those with
I.Q.’s of less than 95 or 100 could similarly be guided (but not
coerced) into courses other than the commercial. A mod- 
erately good vocabulary, reading ability, and mathematical 
achievement are required here too, although the minimum re- 
quirement is somewhat lower than that of the academic course. 
The interests of commercial employees are different from those 
of professional people not in degree, as has been true of their 
abilities, but in kind. To them questions such as why, how, 
and whence are less important. To be specific, they are more 
interested in people as friends, figureheads or freaks than as 
organisms motivated by needs and drives and acted upon by 
forces in a human and material environment; a mountain is 
something to admire, to picnic on or to take a picture of rather 


273 














AND PSYCHOLOGICAL MEASUREMENT 





EDUCATIONAL 


than to analyze as a manifestation of ancient geological go- 
ings-on. Two types of special aptitude are needed for success 
in clerical work, the abilities to recognize verbal and numerical 
symbols with speed and accuracy. In addition, sales people 
need certain personality traits which enable them to make ef- 
fective contact with customers. 

The same tests that are used with the prospective academic 
pupils can be used with those who are considering commercial 
courses. The two special clerical aptitudes can be measured 
by well-proved clerical aptitude tests, even at the fourteen- 
year-old level. Personality cannot so well be measured; tests 
and inventories are available, some of which have their uses, 
but it will be some time yet before they are worth the cost 
for purposes such as those now being considered. 


For the trade courses, as for the pre-professional and 
commercial, a minimum of abstract mental ability is required, 
but if we may judge by the evidence available from trade 
schools and from industrial research, the minimum for most 
trades is lower than for the other two groups. Apparently 
an I.Q. exceeding 85 or 90 is generally sufficient to enable one 
to master the arithmetic and other school subjects needed in 
most skilled trades, although some rate considerably higher 
than most routine office jobs. Given this, certain special apti- 
tudes assume primary importance, the specific aptitudes and 
the amounts of those needed varying somewhat from trade 
to trade. More than average manual dexterity is not needed 
in most skilled trades, but for those in which it is required, 
it can be tested. What appears to be most important both in 
learning a trade and in practicing it is mechanical aptitude or 
insight, a special ability which is independent of scholastic 
aptitude. It can be measured fairly well by means of several 
paper and pencil tests and by performance tests. Of equal 
importance (and probably underlying the former) is ability to 
visualize spatial relations; that is, to judge the relationships 
of shapes and sizes in work such as machine shop and drafting. 
This ability can be measured by good group and individual
tests. Finally there is interest, which must be considered in
this as in other fields. The interests of persons in trade schools 
and in the skilled trades tend to resemble those of people in 
technical schools and in scientific occupations, but on a lower 
mental level. They like the concrete and the practical; they 
prefer to work with objects which they can manipulate and 
transform rather than with abstract problems, with records, 
or with people. These interests, and the aptitudes mentioned 
above, can be measured fairly satisfactorily in early ado- 
lescence. 

We will in time pay more attention to aptitudes needed 
for education for the semi-skilled and unskilled occupations. 
Again, minimum and maximum levels of mental ability will 
need to be taken into account, and these will be lower than 
for the other types of curricula. Achievement in the tool sub- 
jects of the school will have less vocational importance. 
Mechanical aptitude will not play a prominent part. Manual 
dexterity and ability to visualize spatial relations will vary 
considerably from one kind of job to another. Physical 
strength and stamina will play more part in unskilled work, less 
in semi-skilled. The interests of people who enter these occu- 
pations, if we may judge by the not too adequate data now 
available, are not clearly differentiated. Apparently they have 
little in the way of special educational or occupational interests. 
We must study them more intensively in order to find out 
what does really challenge and appeal to them if we are to 
devise satisfactory curricula for these groups. As we learn 
more about them we will develop more adequate tests for 
working with them, especially in the field of interests. The 
other characteristics can be measured reasonably well at 
present. 

A very important objection is not infrequently raised at 
this point. Assuming that we use these tests and obtain such 
information about our pupils, how are we going to get them 
to act upon it? Are we going to tell them what they can and 
what they cannot do? Are we going still further and tell them 
what they may and may not do? Is such action in line with
the democratic philosophy which is one of our basic assumptions?

The answer lies in pointing out that tests can be used for 
either of two purposes: guidance or selection. The necessity 
for questions such as the above arises from a confusion of the 
two, and a consequent misunderstanding of the former. Guid- 
ance, or counseling, consists of helping a person to gain insight, 
to develop self-understanding. Selection involves choosing 
those who have the desired characteristics and offering them 
the opportunity in question. In a democratic society we must 
do both, but the processes must not be confused. 


The vocational and educational counselor has, as his func- 
tion, helping youth to obtain experiences which will give them 
insight into their abilities and interests. Taking aptitude tests 
is one such experience. The counselor is concerned also with
helping the youth to evaluate these experiences. Discussing the test
results and their significance, as shown in the experiences of 
others who have made similar scores, is the way in which the 
results of the test experience are evaluated. The counselor 
does not say that you can, or you cannot, do thus and so; he 
shares with the youth information to the effect that such and 
such a percentage of other youth who made scores comparable 
to his did or did not do thus and so. The significance of 
these experience tables must then be discussed, and the youth 
must make his decision on the basis of the insight thus gained. 
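
An experience table of the kind described above can be sketched in a few lines. The records, the width of the score bands, and the function name below are all hypothetical, invented only to show the form such a table takes; they are not data from any study cited here.

```python
from collections import defaultdict

BAND_WIDTH = 10  # hypothetical width of each score band

def experience_table(records, band_width=BAND_WIDTH):
    """Return {band start: per cent of past students in that band who completed}."""
    counts = defaultdict(lambda: [0, 0])  # band start -> [completed, total]
    for score, completed in records:
        band = (score // band_width) * band_width
        counts[band][0] += int(completed)
        counts[band][1] += 1
    return {band: round(100 * done / total)
            for band, (done, total) in sorted(counts.items())}

# Hypothetical past students: (test score, completed the course?)
past_students = [
    (42, False), (45, True), (47, False),
    (53, True), (55, True), (58, False),
    (61, True), (64, True), (66, True), (69, False),
]

table = experience_table(past_students)
for band, per_cent in table.items():
    print(f"scores {band} to {band + BAND_WIDTH - 1}: {per_cent} per cent completed")
```

The table states only what happened to others with comparable scores. It leaves the decision to the youth, which is the distinction between guidance and selection drawn here.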

The school using aptitude tests for selection is faced with 
another problem. It, too, has experience tables, to use the 
insurance term, or norms, to use the educational. Having 
tested an applicant, it says to him: “You have characteristics
which suggest that you will be successful in this line of training
and work: we will admit you as a student”; or it says:
“Experience shows that most students with your characteristics
do not complete our course, so we do not feel justified in
investing time and money in giving you this training.” Thus,
society protects itself and its resources, and experience teaches 
the individual to make a wiser choice. 



Such uses of tests imply the existence of two basic condi- 
tions: tests which are thoroughly standardized, and test users 
who know their tools. To administer and to score most tests 
is relatively easy. To interpret them wisely requires great 
skill, considerably specialized knowledge, and profound wis- 
dom ripened by experience. 

A few brief words of summary may be helpful in closing. 
The place of aptitude testing in the public schools is the place 
at which choices need to be made. It is the place at which 
objective data are needed to provide a basis for those choices. 
And it is the place at which a trained, skillful, and wise coun- 
selor is available to assist in evaluating the data on which those 
choices should be based. 











EFFECT OF ENGINEER SCHOOL TRAINING ON 
THE SURFACE DEVELOPMENT TEST 


RUTH D. CHURCHILL, JEANNE M. CURTIS, CLYDE H. COOMBS, 
AND THOMAS W. HARRELL, Ist Lt., A.G.D. 


Personnel Procedures Section, The Adjutant General’s Office 


FAUBION, CLEVELAND, AND HARRELL¹ report
that six weeks of “intensive training in mechanical courses
does not significantly increase mechanical aptitude test scores, 
even where the test is very similar to the activities carried out 
in the training. This is strikingly true of the Surface Develop- 
ment Test, in which the items resemble mechanical drafting 
and blueprint reading work.” 

An analysis of the effects of nine weeks’ training at an 
Enlisted Men’s Engineer School gives contradictory results 
for the Surface Development Test. In this case, there are 
significant increases in scores on the second administration of 
the Surface Development Test after nine weeks’ training. 

The content of the Surface Development Test used in this 
study is similar to that of the one used in the previous study; 
it involves matching drawings in two dimensions and in three- 
dimensional perspective. 

Since the same form of the test was used both at the beginning
and at the end of the training, the increases in scores
may be attributed to two factors: practice effect and the actual
training received in the course. The tests were not given to a
control group which received no training, but it is possible to
compare the increases in scores of two different classes at the
school. Since the training received in the Drafting Class is
closely associated with the problems of the Surface Development
Test, this class can be used as the experimental group.
The instruction in the Water Purification Class covers the
principles and applications of electricity and automotive
mechanics as well as water purification. Presumably this material
is little related to the abilities involved in the Surface
Development Test, so that this class can be used as a control
group. Table 1 compares these two classes with respect to
their increases in scores after nine weeks’ training.

¹ R. W. Faubion, E. A. Cleveland, and T. W. Harrell, “The Influence of
Training on Mechanical Aptitude Test Scores.” Educational and Psychological
Measurement, II (1942), 91-94.


TABLE 1

MEAN SURFACE DEVELOPMENT TEST SCORES OF THE TWO CLASSES BEFORE
AND AFTER NINE WEEKS’ TRAINING

                                          Mean      Mean
Class                              No.  (before)  (after)     D      σD    D/σD
Drafting                            66    71.89     93.12   21.23   1.66   12.79
Water Purification                  60    52.48     64.18   11.70   1.45    8.07
Drafting vs. Water Purification           19.41     28.94    9.53   2.20    4.32

The gain made by the Drafting Class on the Surface De- 
velopment Test is significantly greater than that made by the 
Water Purification Class. Both classes also took tests of 
mechanical information and comprehension before and after 
the nine weeks’ training. The content of these tests is not re- 
lated to drafting, although it may be to water purification. 
The two classes made small but significant gains on both tests; 
the Water Purification Class gained more than the Drafting 
Class, significantly so on the mechanical information test. It 
may be inferred, therefore, that, on the Surface Development 
Test, the greater gain of the Drafting Class as compared with 
the Water Purification Class is a result of the content of the 
drafting course. 
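
The critical ratios in Table 1 can be reproduced from the gains and their standard errors. The sketch below takes its figures from the table; the assumption that the standard error of the difference between the two classes is the root of the sum of the squared standard errors is not stated in the source, though it agrees with the tabled value of 2.20.

```python
import math

# Gains (D) and their standard errors, taken from Table 1.
gain_drafting, se_drafting = 21.23, 1.66
gain_water, se_water = 11.70, 1.45

def critical_ratio(difference: float, standard_error: float) -> float:
    """A difference divided by its standard error."""
    return difference / standard_error

# Assumed: the two classes are independent, so the squared standard errors add.
se_between = math.sqrt(se_drafting**2 + se_water**2)
gain_between = gain_drafting - gain_water

print(round(critical_ratio(gain_drafting, se_drafting), 2))  # 12.79
print(round(critical_ratio(gain_water, se_water), 2))        # 8.07
print(round(se_between, 2))                                  # 2.2
print(round(critical_ratio(gain_between, se_between), 2))    # 4.32
```

The last ratio, 4.32, is the one on which the comparison of the two classes rests.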

The most probable explanation of the contradictory results 
of the two studies lies in the difference in the amount and 
intensity of the training which each group received. For the 
airplane mechanics, mechanical drafting and blueprint reading 
was only one out of five courses; over a period of six weeks, 
they received 40 hours’ instruction in this subject. The Draft- 
ing Class at the Engineer School, however, studied nothing 
but drafting and had almost 400 hours’ training in the various 
phases of that subject. 










AN AID TO STUDENT COUNSELORS 


RALPH F. BERDIE 
University of Minnesota 


MUCH TIME is spent in the counseling interview establishing
rapport between the interviewer and the
student and diagnosing problems of varying complexity. Only 
after the counselor has obtained clues to and adequately diag- 
nosed the problems of the student may actual therapeutic 
work proceed. In searching for these clues the counselor often 
spends a great deal of time asking questions and persuading 
students to talk about their activities and past experiences. 
Poor achievement or general dissatisfaction on the part of the 
student may suggest to the counselor the existence of a prob- 
lem, but he must then determine if the student is worrying 
specifically about his health, his inability to get along with his 
father or his meager social life. When this has been done, 
the student can then be helped to do something about his 
problem. 

The student who comes to the counselor usually has a 
complaint. He comes because he is having difficulty in choos- 
ing a vocation or is failing his chemistry or is running out of 
money. These expressed problems demand the attention of 
the counselor and may provide a starting point for his inter- 
view. Most often these complaints are only symptomatic of 
other problems or else are generalized expressions of several 
other problems. A student claiming vocational indecision may
actually be suffering from lack of information regarding his
own abilities and characteristics, a paucity of vocational information
and paternal pressure urging him toward a distasteful
occupation. A student having trouble with his school work
may actually be suffering from poor study habits, inadequate 
reading skills, and too much outside work. After recognizing 
a general problem the counselor must make a more specific 
diagnosis and then initiate treatment. 

Many students approach the counselor with a particular 
orientation dependent upon prevailing stereotypes associated 
with the counseling program. They come for vocational or 
educational advice without even considering that they may be 
able to receive help with some of their other problems. A 
student may come to the counselor for assistance with his 
study methods and never think that he might possibly learn 
how to handle an unpleasant family situation nor realize he 
should try to do anything about it. He has thought of the 
counselor as serving only one of the several purposes actually 
served by that counselor. 

To assist the counselor in his diagnosis and to suggest 
to the student the various functions of the counselor, a prob- 
lem check list has been developed at the Testing Bureau at 
the University of Minnesota and used successfully for over 
one year. The check list consists of thirty-three statements 
of various problems encountered frequently in student coun- 
seling. These problems were obtained from books on coun- 
seling (3), (4), and from a survey of case histories of stu- 
dents. The purpose of the list is to facilitate the interview 
processes and to assist the counselor in determining what prob- 
lems the student faces. It provides an opportunity for the 
counselor to approach problems that are often difficult to 
bring up in the interview and gives the student an opportunity 
to consider what he wants to talk about before the actual 
interview. 

A more extensive problem check list has been published 
by Moody (1). His longer list may prove more useful in
counseling situations which do not provide a great deal of 
other information about students. Where much information 
is obtained through interviews, tests, and questionnaires, a 
shorter check list of problems is more economical and perhaps
more useful in the interview. Wrenn (5) has published a
check list to help the counselor in diagnosing and treating 
problems centering around study habits. Symonds has also 
done extensive work involving a check list of problems of
adolescents and others (2). 

The problem check list was included in the individual 
record form used at the Testing Bureau and given to the stu- 
dents before the counseling interview. Directions to the stu- 
dent were as follows:

Everyone faces problems throughout his life. Some of 
these problems cannot be solved without help. Many times 
they are very easily solved. At other times they are solved 
only after much effort. Below are a list of problems with 
which young people are often concerned. After those prob- 
lems you have not been able to solve adequately, place a check 
(✓). After those problems which you would like to discuss
with a counselor, place a double check (✓✓). These will help
us to be of greater assistance to you. 

The responses of 208 men students and 119 women stu- 
dents were tabulated to determine the number of students 
checking each problem. The number and percentage of men 
and women checking and double-checking each of the items 
are presented in Table 1. 
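
The percentages in Table 1 can be checked against the counts and the two group sizes given above. The sketch below assumes each percentage is the count divided by the size of the group (208 men, 119 women) and rounded to the nearest whole number; the rounding rule is not stated in the text, and the helper name is illustrative.

```python
MEN, WOMEN = 208, 119  # group sizes given in the text

def percent(count, group_size):
    """Percentage of the group, rounded to the nearest whole number."""
    return round(100 * count / group_size)

# (single checks, double checks) for one item, as printed in Table 1
men_counts = (50, 106)
women_counts = (17, 57)

men_percents = tuple(percent(c, MEN) for c in men_counts)
women_percents = tuple(percent(c, WOMEN) for c in women_counts)
print(men_percents, women_percents)
```

These counts give 24 and 51 per cent for the men and 14 and 48 per cent for the women, agreeing with the printed row. Counts of a single student among the men are printed as 0.5 in the table, so very small percentages were evidently not rounded to whole numbers.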

Over one-half of the men and women coming to the Test- 
ing Bureau desired to discuss what they were best able to do. 
Slightly less than one-half wanted to discuss what they would 
like to do. Students coming for counseling express great con- 
cern with their abilities and interests, as well they might. The 
two other items students most wished to discuss were their
study habits and the training requirements for their chosen 
occupations. Students and faculty members have tended to 
place great emphasis upon the educational and vocational 
services offered by the Testing Bureau, and problems in these 
areas are the ones which students are most ready to bring 
to the counselors. 

TABLE 1

NUMBER AND PER CENT OF 208 MEN AND OF 119 WOMEN WHO PLACED SINGLE AND
DOUBLE CHECKS OPPOSITE EACH OF THE PROBLEMS

                                                            Men                    Women
                                                      Single     Double      Single     Double
                                                      Check      Check       Check      Check
                                                     No.    %   No.    %    No.    %   No.    %
 1. I usually feel inferior to my associates .......  30   14    10    5      ?    ?     ?    ?
 2. I have been unable to determine how much time
    I should study .................................   ?    ?     ?    ?      9    8    14   12
 3. I have too few social contacts .................   ?    ?     ?    ?      ?    ?     ?    ?
 4. I have difficulty in making friends ............   ?    ?     ?    ?      ?    ?     ?    ?
 5. I do not know how to obtain the money I
    [...] ..........................................   ?    ?     ?    ?      5    4     4    3
 6. I have been unable to determine what I
    [am best able to do] ...........................  50   24   106   51     17   14    57   48
 7. I do not know how to take good lecture notes ...  58   28    29   14     20   17    17   14
 8. I do not get along well with my parents ........   9    4     0    0      ?    ?     ?    ?
 9. I often have difficulty in keeping friends .....   ?    4     1  0.5      1    1     1    1
10. I am unable to determine what I would
    [like to do] ...................................  30   14    72   35     20   17    42   35
11. I have not obtained parental approval of my
    vocational plans ...............................   ?    ?     ?    ?      ?    ?     2    2
12. I do not have enough to talk about in
    company ........................................   ?    ?     ?    ?      ?    ?     ?    ?
13. I receive inadequate financial help from
    [...] ..........................................   ?    ?     ?    ?      ?    ?     2    2
14. I do not know how to outline text-book
    [...] ..........................................   ?    ?     ?    ?      4    3     9    8
15. I am unable to get along with my brothers
    and/or sisters .................................   3    1     1  0.5      3    3     0    0
16. I have been unable to make a satisfactory
    religious adjustment ...........................   ?    ?     ?    ?      ?    ?     ?    ?
17. I am not interested in my studies ..............   ?    ?     ?    ?      1    1     ?    3
18. I do not have enough information about job
    opportunities and duties .......................  24   12    27   13      ?    ?     ?    ?
19. I am frequently embarrassed when with
    [...] ..........................................  17    8     ?    ?     10    8     ?    2
20. I usually do not enjoy being with members
    of the [...] ...................................  10    5     3    1      4    3     1    1
21. I am unable to do my work well because of
    too many social activities .....................   ?    ?     ?    ?      4    3     1    1
22. I usually do not know how to act in
    company ........................................  10    5     0    0      ?    ?     ?    ?
23. I usually cannot read fast enough to cover
    all of my assignments ..........................   ?    ?     ?    ?     10    8     1    1
24. I usually have difficulty understanding
    [...] ..........................................   ?    ?     ?    ?     12   10     1    1
25. I do not know what the most appropriate
    training is for my chosen career ...............  17    8    39   19      7    6    24   20
26. I do not know if an education is worth while ...   ?    2     6    3      2    2     8    6
27. I feel guilty about something I have or
    have not done ..................................   ?    ?     ?    ?      ?    ?     ?    ?
28. I have so much outside work to do that I am
    neglecting my school work ......................   3    1     3    1      1    1     1    1
29. I have trouble making myself study .............  51   25    19    9     12   10     9    8
30. I lack self-confidence .........................   ?    ?     ?    ?     29   24    10    8
31. I am dissatisfied with my state of health ......  12    6     2    1      4    3     0    0
32. I do not know how to improve my
    personality ....................................   ?    ?     ?    ?      1    1     0    0
33. I do not know how to break certain
    [...] ..........................................   ?    ?     ?    ?      ?    ?     ?    ?

[? = figure illegible in the available copy; [...] = wording illegible;
other bracketed wording is supplied from the text.]

Comparison of the numbers of students checking and
double-checking each item reveals that many students are aware
of problems which they are not eager to discuss with a coun- 
selor. Many students feel inferior to their associates but do 
not express a desire to discuss this with a counselor. Many 
state that they lack self-confidence. Many consider that they 
have too few social contacts but would not like to talk to a 
counselor about this. In view of the many techniques avail- 
able for the counselor in dealing with the social problems of 
students at the University, the students appear to be turning 
away from possible assistance. Relatively more students sin- 
gle-check items related to reading problems than double-check 
these items. Inspection of these figures shows that students 
are aware of their educational and vocational problems and 
are frequently willing to talk about them. Although they 
often recognize social problems and personal problems they 
have little desire to discuss these with their counselors. 

This reluctance to discuss certain types of problem may be 
due to the fact that the students think that nothing can be 
done about these problems and that consequently time would 
be wasted in discussing them with a counselor. They may 
consider their personal problems too private to discuss with 
a relative stranger, but this would hardly explain the few 
double checks opposite the reading problems. When students 
come to the counselor, they come with one primary purpose 
and all other matters may appear irrelevant at that time. 
Coming to a counselor may follow or accompany some crisis 
in the life of a student, a failure or a forced change of voca- 
tional plans, and this crisis may envelop the entire horizon 
of the student. 

The students included in this study had also filled out the
Minnesota Personality Scale. On this test scores are available
for morale, social adjustment, family adjustment, emo-
tional adjustment, and economic conservatism. The score on 
the morale section is related to the individual’s emotional 
acceptance of surrounding social and community situations. 
Very high scores may indicate naive optimism, low scores cynicism
or lack of hope for the future. Scores on the social adjustment
section are related to the social maturity, gregariousness,
and socialization of the individual. High scores on
the family relations section indicate friendly and healthful 
child-parent relations. Scores on the emotionality section are 
related to emotional stability. Low scores on this section 
may often result from hypochondriasis, anxiety states, or over- 
reactive tendencies. Scores on the economic conservatism sec- 
tion are related to the liberality of the individual’s economic 
attitudes. 


On the basis of selected personality test scores, comparisons
were made between the means of those students who
had checked items and those who had not checked those same
items. For example, of the 193 men for whom complete data
were available, 11 checked the item, “I do not know if an
education is worth while.” This item was left unchecked by
182 students. The mean personality scores for groups check- 
ing and not checking the items and their critical ratios (dif- 
ference divided by standard error of the difference) are pre- 
sented in Table 2. The items have been grouped into func- 
tional categories on an inspectional or logical basis. Only those 
items have been included which appeared relevant to the scores 
on the Personality Scale. 

On each item related to social behavior, the group check- 
ing the item had significantly lower scores on the social ad- 
justment test than did the group not checking the item. The 
men who indicated that they had too many social activities, 
however, had higher scores on the social adjustment section, 
as would be expected. Since many of the problems on the 
check list are very much like some of the items on the test, 
the observed relationship is not surprising. Students who 
claim that they do not get along well with their parents ob- 
tain significantly lower scores on the family relations section 
of the personality scale than students who do not check this 
item. However, men whose parents do not approve of their 
vocational choice do not obtain significantly lower scores on 
this section. Students who claim that they have been unable 
to make a satisfactory religious adjustment make no lower
scores on the emotional adjustment scale than do other students.
Guilt feelings apparently are related to the score on
this section, and both men and women who check this item
obtain significantly lower scores than those who do not check
it. Men who claim dissatisfaction with their state of health
obtain lower emotional adjustment scores, as do men who
claim to lack self-confidence. Women who indicate a lack
of self-confidence, however, do not differ significantly on the
basis of this scale from other women.

TABLE 2

COMPARISON OF MEAN SCORES ON PERSONALITY SCALE OF STUDENTS CHECKING AND
NOT CHECKING PROBLEMS

                                                       Mean of     Mean of
                                                       Students    Students
Section of Personality Scale                           Who         Who Did
and Problem                                            Checked     Not Check  Critical
                                                       Item        Item       Ratio

MEN
Morale
  I do not know if an education is worth while .......   156.55     157.45       .20
Social
  I have too few social contacts .....................   196.83     223.66      4.56
  I have difficulty in making friends ................   184.88     221.71      3.72
  I do not have enough to talk about in company ......   192.19     226.01      5.74
  I am frequently embarrassed when with [...] ........   182.06     222.19      3.70
  I usually do not enjoy being with members of the
    [...] ............................................   187.54     220.90      2.94
  I am unable to do my work well because of too many
    social activities ................................   239.80     217.50      2.86
  I usually do not know how to act in company ........   177.10     220.92      2.98
  I lack self-confidence .............................   197.95     224.07      4.10
Family
  I do not get along well with my parents ............    99.66     120.06      2.59
  I have not obtained parental approval of my
    vocational plans .................................   109.78     119.56      1.44
Emotional
  I have been unable to make a satisfactory
    religious adjustment .............................   134.36     129.77       .94
  I am frequently embarrassed when with [...] ........   119.41     131.13      2.20
  I usually do not enjoy being with members of the
    [...] ............................................   128.23     130.23       .42
  I feel guilty about something I have or have
    not done .........................................   118.57     131.00      2.26
  I lack self-confidence .............................   121.70     132.29      3.15
  I am dissatisfied with my state of health ..........   116.85     131.06      2.39

WOMEN
Social
  I have too few social contacts .....................   182.47     201.19      2.29
  I have difficulty in making friends ................   162.56     201.90      5.20
  I have not enough to talk about in company .........   177.00     203.96      3.99
  I am frequently embarrassed when with [...] ........   179.09     200.78      2.16
  I lack self-confidence .............................   179.97     207.58      4.76
Family
  I do not get along well with my parents ............   102.38     139.02      6.76
Emotional
  I have been unable to make a satisfactory
    religious adjustment .............................   161.29     163.70       .29
  I am frequently embarrassed when with [...] ........   146.64     165.55      3.10
  I feel guilty about something I have or have
    not done .........................................   148.10     165.19      2.34
  I lack self-confidence .............................   158.55     165.89      1.51

Of the 27 relationships analyzed, 21 were found to be 
statistically significant. Students tending to obtain low scores 
on the various sections of the personality scale will also tend 
to check related items on a problem list and thus supply the 
counselor with a clue regarding the source of these low scores. 
Perhaps the same information could be obtained by going 
through the items of the test, but as there are 218 items in 
the test, each with five possible answers, this would require 
much of the counselor’s time. A factor analysis of the items 
of the test would perhaps identify a few key items which could 
be used for the same purpose as the problem check list, but 
until this is done, the counselor may more economically glance 
at the 33 items on the check list than read through the many 
items of a personality test. 
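
The count of 21 significant relationships out of 27 can be recovered from the critical ratios in Table 2. The sketch below assumes a cutoff of 2.0 for significance; the text does not state the criterion used, and any cutoff between 1.51 and 2.16 gives the same count. Several ratios below 1 are printed without a legible decimal point and are read here as .94, .42, and .29.

```python
# Critical ratios as read from Table 2, in printed order.
men_ratios = [
    0.20,                                            # morale
    4.56, 3.72, 5.74, 3.70, 2.94, 2.86, 2.98, 4.10,  # social
    2.59, 1.44,                                      # family
    0.94, 2.20, 0.42, 2.26, 3.15, 2.39,              # emotional
]
women_ratios = [
    2.29, 5.20, 3.99, 2.16, 4.76,                    # social
    6.76,                                            # family
    0.29, 3.10, 2.34, 1.51,                          # emotional
]

CUTOFF = 2.0  # assumed criterion; not stated in the article

all_ratios = men_ratios + women_ratios
significant = [cr for cr in all_ratios if cr >= CUTOFF]
print(len(all_ratios), len(significant))
```

With this cutoff the six non-significant comparisons are the morale item, parental approval, religious adjustment (men and women), enjoyment of company on the emotional section for men, and self-confidence on the emotional section for women, which matches the discussion in the text.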

Added to the statistical evidence concerning usefulness of 
the problem check list is much evidence obtained from clinical 
work involving the use of this instrument. The description 
of a few cases in which it has proved useful will exemplify 
this and also suggest various techniques that have proved suc- 
cessful in using the list in the interview. 

Joseph H. came to the Testing Bureau for assistance in 
deciding upon a major in the College of Education. He was 
completing his second year in the university and had been 
doing slightly better than average work. He had graduated 
from a small high school, and his social life had been very 
restricted in the little town from which he came. Among 
other items, he double-checked that he had too few social 
contacts. His percentile score on the social adjustment section
of the Personality Scale was 24. After discussing the boy’s
vocational plans, the counselor said, “I see that you check
that you have too few social contacts. What do you think
you could do about that?” Joseph started to discuss the facilities
available on the campus and soon he and the counselor
had a social program planned, and the counselor gave him a 
letter of introduction to the secretary of the Y. M. C. A. 
The item checked by the boy in this case gave the counselor 
an opportunity to approach a problem the existence of which 
might have been easy to determine but for which treatment 
might not have been initiated so easily without the item. 

George S. had checked several items on the problem check 
list, including the one, “I feel guilty about something I have 
or have not done.” A single check had been placed opposite
this item. During the interview the counselor decided that 
the boy presented a picture of a very unstable individual and 
that various personal problems might interfere with his prog- 
ress when he entered college the following fall. The coun- 
selor was unable, however, to get the boy to discuss these 
problems. Finally, he said, “I see you checked here that you 
feel guilty about something you have done or have not done.” 
After a pause he continued, “Many people feel guilty about 
things they have done, and usually feeling guilty about it is 
the only thing that does any harm.” He paused again, and 
George began to speak of the problems that had been worry- 
ing him and of his reactions to these problems. In this case, 
the problem check list provided an opportunity for the coun- 
selor to approach a problem which had previously resisted 
all attempts to approach it. 

We have found that when an item is double-checked, the
most convenient and profitable thing the counselor can do is 
to refer directly to the item and give the student an oppor- 
tunity to elaborate upon his response. When only a single 
check is placed opposite the item, however, this can seldom be 
done. The counselor will have to remember that the student 
did not indicate that he wanted to talk about the subject 
checked and that he may actually resent any attempt on the
part of the counselor to start such a discussion. When an
item has been checked only once, the counselor can often 
discuss the problem mentioned and give the student an oppor- 
tunity to ask questions without making the student aware that 
the item itself is being referred to. 

Sarah W. placed a single check opposite the item, “I have
not obtained parental approval of my vocational plans.” During
the interview, after discussing various alternatives, the
counselor asked, “What do you think your parents would like
you to do?” Sarah then told what her parents’ reactions were
and also revealed a family problem which had not even been
suspected up to that point in the interview. If the counselor
had asked, “Why don’t your parents approve of your vocational
plans?”, it is doubtful if Sarah would have given the
information she actually gave. 


Summary 


Statistical analysis of a problem check list and its clinical 
use have shown that it is a useful instrument in diagnosing 
students’ problems and in approaching these problems in the 
interview. The items checked offer the counselor an oppor- 
tunity to select those areas which offer most promise for in- 
vestigation and to introduce these topics in the counseling 
interview. The items also assist in orienting the student to- 
ward the counselor and in reaching a definition of his prob- 
lems before the interview. 


REFERENCES 


(1) Moody, R. Problems Check List. Columbus, Ohio: Ohio University
Press, 1941.

(2) Symonds, P. K. “Life Problems and Interests of Adolescents,”
School Review, XLIV (1936), 506-518.

(3) Williamson, E. G. How to Counsel Students. New York:
McGraw-Hill, 1939.

(4) Williamson, E. G. and Darley, J. G. Student Personnel Work.
New York: McGraw-Hill, 1937.

(5) Wrenn, C. G. Study-Habits Inventory. California: Stanford University
Press, 1941.
















A COMPARISON OF THE HUMAN BEHAVIOR 
INVENTORY WITH TWO OTHER PERSON- 
ALITY INVENTORIES 


ABRAHAM SPERLING 
City College of New York 


PENCIL-AND-PAPER TESTS for diagnosing personality
traits have too frequently proved unsatisfactory to the
investigators employing them. Statements expressing discon- 
tent with the diagnostic results of such tests are found in 
studies by Watson (1), Mosier (2), Landis (3), Moore and 
Steele (4), Feder and Mallet (5), Gorham and Brotemarkle 
(6), Stagner (7), and others too numerous to include here. 
Accompanying the criticisms, however, constructive sugges- 
tions are frequently made for the improvement of such instru- 
ments. Among the suggestions offered are the use of multiple 
answers, the use of weighted scoring, the development of reli- 
ability, better definition of terms, and abstention from scoring 
the same items for more than one trait. Because it is felt that 
the Human Behavior Inventory,¹ devised by Randolph B.
Smith, represents an improvement in adjustment scales in ac- 
cordance with these suggestions, it is the desire of the investi- 
gator to bring this instrument to the attention of possible 
users. 

Employed in an experimental study (8) conducted by the 
investigator, the Human Behavior Inventory proved to be a 
most satisfactory instrument for measuring traits of person- 
ality adjustment. It was devised for the purpose of testing per- 
sonality adjustment of a group of college students. Smith de- 
veloped the instrument by selecting from previous inventories 
the items found most diagnostic, modifying them in an effort
to make them capable of measuring status as well as change,
and adding new items where necessary. In his original study,
Smith (9) employed the test to measure the personality status
of college students at the beginning and end of a school year
in which the individuals had been subjected to a course in
mental hygiene.

¹This inventory is reproduced in a monograph by Smith (9).


Description of the Test

The inventory was developed to yield a total score which 
might serve as a measure of general personality adjustment, 
together with separate subscores on six individual sections 
(1—work efficiency, 2—superiority-inferiority or degree of 
self-confidence, 3—social acceptability and adjustment, 4— 
emotional stability with reference to neurotic symptoms, ease 
of adjustment to new experiences, and general sex adjust- 
ments, 5—objectivity toward behavior of others, and 6—fam- 
ily attitudes and relationships) which may be regarded as ma- 
jor characteristics of mental health and emotional maturity. 
The scale contains 102 questions, answers to which are based 
on a five-degree multiple choice. Each item is given a score 
ranging from 0 to 4, depending upon the degree of the answer. 
The estimated reliability of the total test score by odd-even 
correlation for 1125 cases was .89 ± .01 and by test-retest
correlation for 465 cases after six months was .81 ± .01.
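
The scoring scheme described above can be sketched in a few lines of Python. The section sizes used here (9, 11, 13, 29, 14, and 26 items) are taken from what appears to be the item-count column of Table 4, and they sum to 102. Which items belong to which section is not given in this article, so the contiguous grouping below is purely a hypothetical layout for illustration.

```python
# Section sizes as read from Table 4; the ordering of items within the
# inventory is an assumption made only for this illustration.
SECTION_SIZES = {
    "work efficiency": 9,
    "superiority-inferiority": 11,
    "social acceptability": 13,
    "emotional stability": 29,
    "objectivity": 14,
    "family relationships": 26,
}
MAX_ITEM_SCORE = 4  # each item is scored 0 to 4

def score_inventory(responses):
    """Return (total score, subscores by section) for one respondent."""
    expected = sum(SECTION_SIZES.values())
    if len(responses) != expected:
        raise ValueError(f"expected {expected} responses, got {len(responses)}")
    if any(r not in range(MAX_ITEM_SCORE + 1) for r in responses):
        raise ValueError("each response must be an integer from 0 to 4")
    subscores = {}
    start = 0
    for section, size in SECTION_SIZES.items():
        subscores[section] = sum(responses[start:start + size])
        start += size
    return sum(responses), subscores

# A respondent who chooses the middle alternative on every item.
total, subscores = score_inventory([2] * 102)
print(total, subscores)
```

Under this scheme the highest possible total is 102 × 4 = 408, and the total is simply the sum of the six subscores, so no item is counted toward more than one trait.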


Procedure 


This investigation was undertaken to compare the Human 
Behavior Inventory with previously constructed scales. Sev- 
eral of the instructors in elementary psychology at the College 
of the City of New York administered to their classes the 
Human Behavior Inventory, the Bell Adjustment Inventory 
(10), and the short form of the Thurstone Personality Sched- 
ule as revised by R. R. Willoughby, which is known as the 
Clark-Thurstone Inventory (11). Statistical data concerning 
these scales are described in the bibliographical references 
noted. 

To each class in elementary psychology both the Human 
Behavior Inventory and the Clark-Thurstone Inventory were
given during the same period. The Bell Inventory was given
during the subsequent class meeting. One hundred seven complete
sets of inventories were made available to the investigator.


The Data


The intercorrelations of the three scales are presented 
in Table 1, while other statistics concerning each inventory 
are given in Table 2. 

TABLE 1

INTERCORRELATIONS AMONG HUMAN BEHAVIOR INVENTORY, BELL
INVENTORY, AND CLARK-THURSTONE INVENTORY

                                                No. of Items in   No. of Identical
                             Coefficient of     Respective        Items Between
Inventories                  Correlation        Scales            Scales
Human Behavior Inventory                             102
  and                         .736 ± .030                               12
Bell Inventory                                       140

Clark-Thurstone                                       25
  and                         .748 ± .029                                7
Human Behavior Inventory                             102

Bell Inventory                                       140
  and                         .785 ± .026                               18
Clark-Thurstone                                       25
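
The figures following each coefficient in Table 1 are not labeled in the article. If they are probable errors computed by the formula customary at the time, PE = 0.6745 (1 − r²) / √N, they can be approximated from r and the 107 cases. The sketch below rests on that assumption, and the function name is illustrative.

```python
import math

def probable_error(r, n):
    """Probable error of a correlation coefficient r based on n cases."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

N = 107  # complete sets of inventories in this study

errors = {r: round(probable_error(r, N), 3) for r in (0.736, 0.748, 0.785)}
for r, pe in errors.items():
    print(f"r = {r:.3f}  PE = {pe:.3f}")
```

The formula reproduces the printed .030 and .029 for the first two coefficients. For .785 it gives .025 rather than the printed .026, so either the coefficient was rounded before printing or a slightly different computation was used.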





TABLE 2

SCORES ON HUMAN BEHAVIOR INVENTORY, BELL INVENTORY, AND
CLARK-THURSTONE INVENTORY*

                                                        Coefficient of
Scales               Range      Mean       S. D.        Reliability
Human Behavior       36-202     120        38.55        .918
  Inventory                     (123)      (39.23)      (.89)
Bell Inventory       1-77       33.2       16.50        (.93)
                                (32)
Clark-Thurstone      2-67       26         15.07        (.91)
  Inventory                     (29)       (13.70)
Age                  17-24.6    19.5                    N = 107


*Figures in parentheses are from the original studies by the authors. 




The fact that the Human Behavior Inventory correlates 
rather highly with the two older scales should not give the 
impression that it is a duplication of the exact content of the 
scales with which it was compared. However, the high cor- 
relations probably indicate that it tends to measure the same 
factors, namely, those of personality adjustment. To check 
whether the high correlations were due to mere identity of the 
items in the scales, an analysis of the three questionnaires was 
made. 

The analysis (summarized in Table 1) showed that there 
were eighteen identical items in the Bell and Clark-Thurstone, 
seven in the Human Behavior Inventory and the Clark-Thur- 
stone, and twelve in the Human Behavior Inventory and the 
Bell. While it is thus seen that exact identity of items alone 
does not provide a reasonable explanation of the fairly high 
correlations between the scales, it is of course recognized that 
similarity of items not identical may also be a factor. 

The intercorrelations among the sections of the scales 
have been reproduced for two reasons: first, to offer a record 
of the data for the benefit of others who may wish to make 
comparisons and, second, to demonstrate the similarity of re- 
sults obtained in this study and in the original studies by the 
respective authors. To illustrate, Tables 3 and 4 show the 
closeness of the mean scores and intercorrelations obtained 
from the 107 subjects of this study to those of the 1145 of 
Smith’s study and the 258 of Bell’s study. The rather high 
interrelationships among the parts of the Human Behavior 
Inventory may be an indication that the sub-scores do not 
represent separate psychological factors. However, it is pos- 
sible that they are indicative of truly close relationships 
among the several personality traits measured by the subsec- 
tions. Further exploration of these possibilities may well be 
the subject of a subsequent investigation. 

TABLE 3

CORRELATIONS OF PARTS OF BELL INVENTORY WITH EACH OTHER
AND WITH TOTAL*

Parts          Health     Social     Emotional     Total
Home            .39        .21         .54          .757
               (.43)      (.04)       (.38)
Health                     .18         .45          .629
                          (.24)       (.53)
Social                                 .44          .655
                                      (.47)
Emotional                                           .832

*Figures in parentheses are from the original study by the author.


TABLE 4

CORRELATIONS OF PARTS OF HUMAN BEHAVIOR INVENTORY WITH
EACH OTHER AND WITH TOTAL*

           Sup.     Soc.     Emot.             Fam.                                            [No. of
Parts**    Inf.     Acc.     Stab.    Obj.     Rel.     Total    [?]      Mean       S. D.     Items]
Work        .60      .47      .56      .42      .38      .58     .508     12.87       4.31
Eff.       (.59)    (.52)    (.56)    (.40)    (.44)    (.68)            (12.94)     (4.90)       9
Sup.                 .68      .73      .59      .43      .78     .772     14.12       5.86
Inf.                (.66)    (.65)    (.46)    (.46)    (.76)            (15.57)     (5.96)      11
Soc.                          .76      .36       ?       .80     .797     16.86       6.66
Acc.                         (.72)    (.47)    (.54)    (.80)            (17.75)     (7.05)      13
Emot.                                  .66      .64      .92     .908     28.59      11.20
Stab.                                 (.63)    (.60)    (.88)            (30.22)    (11.55)      29
Obj.                                            .45       ?      .615     18.58       7.43
                                               (.55)    (.74)            (18.08)     (6.93)      14
Fam.                                                     .78     .607     29.43      12.26
Rel.                                                    (.85)            (28.50)    (11.85)      26

*Figures in parentheses are from the original study by R. B. Smith.
**The abbreviations of the subsection names refer to Work Efficiency,
Superiority-Inferiority, Social Acceptability, Emotional Stability, Objectivity, and
Family Relationships.
[? = figure or column heading illegible in the available copy; bracketed
headings are supplied.]

It may be pertinent to mention at this point that in the
opinion of the investigator the importance of establishing rapport
between experimenter and subject for best results from
pencil-and-paper tests of personality cannot be overemphasized.
In this study, extreme care was taken in the matter of
rapport. In the original instructions each student was asked 
to volunteer his efforts in a research study that would have no 
bearing on his grades or standing at the college. Each individual
was told that his replies would be treated entirely confidentially
and anonymously unless he desired otherwise. He
was asked to be sincere and objective and to inform the in- 
vestigator if he felt his rapport was not valid. As an added 
incentive for an honest expression of their own characteristics 
as they know them, the students were told that they would be 
given the results of their tests in such a manner that they 
could compare their scores with the average of others taking 
part in the study if they so desired. A majority of the stu- 
dents were known to the investigator in a rather friendly stu- 
dent-teacher relationship. It is the opinion of this investigator 
that disappointing results from the use of personality ques- 
tionnaires are frequently due to a lack of rapport between 
experimenter and subjects. 


Summary and Conclusion 


The coefficient of correlation between the Human Behavior
Inventory and the Bell Inventory was .736, that between the
Human Behavior Inventory and the Clark-Thurstone Inventory
.748, and that between the Bell and the Clark-Thurstone .785.
An analysis of the three measures showed more similar items
between the Bell and the Clark-Thurstone scales than between
the Human Behavior Inventory and either of these measures.

In view of the similar positive coefficients of intercorrela- 
tion among the three scales, it may be concluded that the 
Human Behavior Inventory is probably as satisfactory for use 
as a diagnostic measure of personality adjustment as either 
of the other two measures with which it was compared. More- 
over, the scale embodies several desirable features such as the 
use of multiple answers, weighted scoring, high reliability, 
clear definition of terms, and abstention from scoring the same 
items for more than one trait. Since these aspects are among 
the suggestions made by authorities for the improvement of 
personality scales, they lend support to the acceptance of the 
Human Behavior Inventory as an instrument for measuring 
traits of personality adjustment. 




REFERENCES 


1. Watson, G. “Personality and Character Measurement,” Review
   of Educational Research, VIII (1938), 269-291.

2. Mosier, C. I. “On the Validity of Neurotic Questionnaires,”
   Journal of Social Psychology, IX (1938), 3-16.

3. Landis, C. “Empirical Evaluation of Three Personality
   Adjustment Inventories,” Journal of Educational Psychology,
   XXVI (1935), 321-330.

4. Moore, H. and Steele, I. “Personality Tests,” Journal of
   Abnormal and Social Psychology, XXIX (1934), 45-52.

5. Feder, D. and Mallet, D. “Validity of Certain Measures of
   Personality Adjustment,” Journal of American Association of
   College Registrars, XIII No. 1 (1937), 5-15.

6. Gorham, D. R. and Brotemarkle, R. “Challenging Three
   Standardized Emotionality Tests for Validity and
   Employability,” Journal of Applied Psychology, XIII (1929),
   554-588.

7. Stagner, R. “The Intercorrelation of Some Standardized
   Personality Tests,” Journal of Applied Psychology, XVI
   (1932), 453-464.

8. Sperling, A. The Relationship between Personality Adjustment
   and Achievement in Physical Education Activities, Doctoral
   dissertation, 1941. On file in the library of New York
   University, New York.

9. Smith, R. B. Growth in Personality Adjustment Through Mental
   Hygiene, Albany, New York: University of the State of New
   York, State Education Department, 1936.

10. Bell, H. M. Manual for the Adjustment Inventory, Stanford
    University, California: Stanford University Press, 1934.

11. Willoughby, R. R. “Some Properties of the Thurstone
    Personality Schedule and a Suggested Revision,” Journal of
    Social Psychology, III (1932), 401-424.














INTRA-INDIVIDUAL DIFFERENCES VERSUS 
INTER-INDIVIDUAL DIFFERENCES 
IN MOTOR SKILLS¹


WILLIAM A. OWENS, JR. 


Iowa State College 


STUDIES OF VARIATION within and between individuals have
tended to be restricted to one sort of intra-individual
variation, trait differences. They have also tended,
in treating of the relative magnitudes of individual differences 
and trait differences, to display adherence to one of two modes 
of attack. Either they have dealt with the inter-correlations 
of certain traits or functions, or they have shown a comparison 
of a trait standard deviation with a standard deviation rep- 
resentative of individual differences. 

The present paper is an attempt to evaluate trait differ- 
ences and certain other intra-individual factors, and to relate 
them in magnitude to individual differences. 

The writer feels that the statistical technique which was 
employed in the present investigation was superior to either 
of the two which are conventionally used for the reasons which 
follow. First, neither the method of inter-correlation nor the 
method of comparing standard deviations will allow of the 
treatment of more than one intra-individual factor. Second, 
even in the comparison of individual and trait differences, the 





¹ This article is a condensation of the writer’s doctoral dissertation of the
same title, a copy of which is on file at the library of the University of 
Minnesota. 

The writer wishes to acknowledge the invaluable criticisms and suggestions 
of his advisors, Professor D. G. Paterson and Dr. P. O. Johnson. He also 
wishes to recognize the assistance of Dr. Brent Baxter and of Mr. Paul G. 
Homeyer. 

The actual experimental work was done in the psychological laboratories at 
the University of Minnesota with the cooperation of Dr. M. A. Tinker, and was 
made possible through a research grant by the Graduate School of that in- 
stitution. 




magnitude of the correlation coefficient is conditioned by at 
least two variables besides the true magnitude of the relation- 
ship. These are unreliability of measurement and trait vari- 
ability. Third, product-moment correlation, or its equivalent, 
deals with absolute deviate ranks; no account of variation 
within these rank positions is taken so long as they do not 
actually shift.³ Fourth, standard deviations are increased by
unreliabilities of measurement. If a systematic error were 
present, the error in measuring trait differences would not be 
equal to the error in measuring individual differences. Even 
if this were not the case, a constant error factor would be a 
relatively larger component of trait differences—if it were the 
smaller—than of individual differences. A statement of pro- 
portionality could, thus, only be accurate if individual and 
trait differences were of the same magnitude. 

The present experiment, which was designed for applica- 
tion of the analysis of variance, was planned with a view to 
taking account of these several objections to other techniques. 
In accordance with this purpose, certain facts are worth 
noting. First, account is taken of more than one intra-indi- 
vidual factor. Second, various sorts of unreliability are incor- 
porated in the estimate of error with at least two relevant 
consequences: the estimates of the magnitudes of the main 
factors are correspondingly more accurate, and direct tests of 





² Since terminology in this field has not been entirely uniform, the writer
includes his own definitions of the terms he employs.

Inter-individual differences = differences in relative proficiency from
individual to individual.

Intra-individual differences = (1) trait differences, (2) repetitive
variations, (3) trait variability, et al.

Trait differences = differences in relative proficiency from function to
function within the individual.

Repetitive variations = changes in the individual’s proficiency from day to
day in the average of all functions measured. The systematic portion of this
shift might be designated as learning, and the random portion attributed to
shifts in the subject’s efficiency.

Trait variability = the term used by Paulsen to denote the fluctuation of a
given function within an individual, temporally. See Paulsen, G. B. “A
Coefficient of Trait Variability,” Psychological Bulletin, XXVIII (1935), 218-19.

³ Harris has taken account of such a contention in developing his method of
relative correlation. The procedure is to correlate a first variable with the
deviation of a second from its most probable value. Harris, J. A. “The
Correlation Between a Variable and the Deviation of a Dependent Variable from
its Most Probable Value,” Biometrika, VI (1908), 438-443.




the significances of these factors are made possible. Third, the 
analysis is so based as to minimize the errors normally in- 
curred in the ranking of data, while the statistical technique 
employed takes account of the total variation—none of it goes 
exempt from analysis. With this brief preface, the present 
experiment may be outlined. 

The Problem—To obtain an estimate of the relative mag- 
nitudes of individual differences and of several intra-individual 
factors on some tests of motor skills. 

The Method—Table 1 provides an abbreviated illustra- 
tion of the technique employed to determine the per cent of 
the total variation in score contributed by individual differ- 
ences. 




















TABLE 1
A SAMPLE ANALYSIS

Individuals    Administrations: Block Packing
(15)           II      III     IV     ...    VIII
A              425     448     425
B              502     469     530
C              514     541     648
D              384     403     392
E              345     436     463
F              512     467     577
G              511     611     623
H              241     372     428
I              522     573     548
J              453     498     567
K              317     380     302
L              461     547     517
M              394     478     496
N              534     567     631
O              381     441     491

Norm: M = 500, σ = 100

Correction Term        = (21365)²/45 = 10,143,627.22
Total Variation        = 356,569.78
Individual Differences = 272,389.11
Repetitive Variations  = 44,667.51
Error (Interaction)    = 39,513.16

          Degrees of    Sum of         Mean
Factor    Freedom       Squares        Square       F        P       %
T.V.      44            356,569.78                                  100
I.D.      14            272,389.11     19,456.37    13.79    <.01    68
R.V.       2             44,667.51     22,333.76    15.83    <.01    16
Err.      28             39,513.16      1,411.18                     16
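
The arithmetic of Table 1 can be retraced with a short sketch. The Python below is offered only as an illustration of a two-way analysis of variance without replication, not as the writer's own procedure; the function name and the dictionary layout are assumptions. The entries for H (administration III) and K (administration II) are read as 372 and 317, the readings that reproduce the printed correction term and sums of squares.

```python
# Two-way analysis of variance without replication
# (individuals x administrations), using the standard scores of Table 1.
# H-III = 372 and K-II = 317 are assumed readings; they reproduce the
# printed sums of squares.
SCORES = {
    "A": (425, 448, 425), "B": (502, 469, 530), "C": (514, 541, 648),
    "D": (384, 403, 392), "E": (345, 436, 463), "F": (512, 467, 577),
    "G": (511, 611, 623), "H": (241, 372, 428), "I": (522, 573, 548),
    "J": (453, 498, 567), "K": (317, 380, 302), "L": (461, 547, 517),
    "M": (394, 478, 496), "N": (534, 567, 631), "O": (381, 441, 491),
}


def two_way_anova(table):
    """Return sums of squares and F ratios for a rows-by-columns table."""
    rows = list(table.values())
    n_rows, n_cols = len(rows), len(rows[0])
    grand_total = sum(sum(r) for r in rows)
    correction = grand_total ** 2 / (n_rows * n_cols)

    total_ss = sum(x * x for r in rows for x in r) - correction
    row_ss = sum(sum(r) ** 2 for r in rows) / n_cols - correction
    col_totals = [sum(r[j] for r in rows) for j in range(n_cols)]
    col_ss = sum(t ** 2 for t in col_totals) / n_rows - correction
    error_ss = total_ss - row_ss - col_ss  # interaction used as error

    df_rows, df_cols = n_rows - 1, n_cols - 1
    ms_error = error_ss / (df_rows * df_cols)
    return {
        "correction": correction,
        "total_ss": total_ss,
        "individuals_ss": row_ss,
        "administrations_ss": col_ss,
        "error_ss": error_ss,
        "f_individuals": (row_ss / df_rows) / ms_error,
        "f_administrations": (col_ss / df_cols) / ms_error,
    }


result = two_way_anova(SCORES)
for name, value in result.items():
    print(f"{name:>20}: {value:,.2f}")
```

Run on these scores, the sketch returns the correction term, the three sums of squares, and the two F ratios printed in the table.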




The analysis of variance was employed with two criteria
of classification: individuals and administrations.⁴ The
isolates from the total variation (T.V.) were individual
differences (I.D.),⁵ repetitive variations (R.V.), and an
estimate of error (Err.). Especially to be noted is the fact that
the per cent column furnishes an estimate of the magnitude of
the contribution of individual differences to the total variation.
An analysis of this sort was run for each test of the present
experiment.

The analysis of intra-individual factors took a similar 
form. Table 2 illustrates the method and the character of 
the second series of analyses. The isolates from the total 
variation (T.V.) were trait differences (T.D.), repetitive 
variations (R.V.), and error (Err.). An analysis of this sort 
was made for each subject in the experimental group. The in- 
tention was ultimately to compare the per cent contribution of 
individual differences to the total variation in the first series 
of analyses with the respective per cent contributions of trait 
differences and repetitive variations to the total variation in 
the second series of analyses. 

Seven tests of motor skills were employed in this
investigation. They were the block packing, steadiness, speed of
movement, slow movement, stick balancing, tapping, and card
sorting tests reported in the Minnesota Mechanical Ability
Study. Each one of 15 subjects was given eight administrations
of each of the seven tests, 56 testings per subject.⁶ The
tests were administered in systematically varied order on a
schedule calling for about eight hours of each person’s time.
Subjects were junior high school boys matched for age,
intelligence, and race, both with each other and with a norm group
of 216 individuals.⁷ These boys were paid for their time, and





⁴ C. H. Goulden, Methods of Statistical Analysis (New York: John Wiley
and Sons, 1939), pp. 114-141; especially p. 127.

⁵ From this type of analysis only the individual differences factor figures in
later comparisons.

⁶ Results from all first administrations were, as usual, discarded as
unreliable.

⁷ D. G. Paterson, R. M. Elliott, et al. Minnesota Mechanical Ability Tests
(Minneapolis: University of Minnesota Press, 1930), p. 586.






















TABLE 2
A SAMPLE ANALYSIS

Tests                 Administrations: Subject E
(6)                   II      III     IV     ...    VIII
Block Packing         345     436     463
Steadiness            370     370     306
Slow Movement         512     618     583
Speed of Movement     652     712     649
Tapping               557     601     619
Stick Balancing       520     525     538

Norm: M = 500, σ = 100

Correction Term      = (9376)²/18 = 4,883,854.22
Total Variation      = 234,357.78
Trait Differences    = 213,415.11
Repetitive Variation = 8,069.78
Error (Interaction)  = 12,872.89

          Degrees of    Sum of         Mean
Factor    Freedom       Squares        Square       F        P        %
T.V.      17            234,357.78                                   100
T.D.       5            213,415.11     42,683.02    33.16    <.01     89
R.V.       2              8,069.78      4,034.89     3.13    >.05*     3
Err.      10             12,872.89      1,287.29                       8

*See a later reference on combining independent probabilities.





several prizes were awarded at the completion of the testing. 
Motivation appeared to be excellent. 

Two methodological issues now demand attention. First, 
in order to evaluate trait differences — differences in the sub- 
jects’ relative proficiency from test to test — the various tests 
themselves had to be equated. This was accomplished in the 
following fashion. The seven norm distributions of 216 cases 
each were checked for normality, and the five which departed 
from the criterion were normalized. The pertinent data are 
included in Tables 3 and 4. These distributions were then 
assigned comparable scores after the method described by
Hull, McCall, et al.⁸ Specifically, each distribution was
assigned a mean of 500 and a standard deviation of 100. The
scores of the subjects in the present experimental group were 
converted to this form and evaluated in terms of these equated 





⁸ C. L. Hull, “The Conversion of Test Scores into Series Which Shall Have
Any Assigned Mean and Degree of Dispersion.” Journal of Applied Psy- 
chology, VI (1933), 298-300. 
















TABLE 3
NORMS

Test                   M          σ          Transformation      Value                 N
Block Packing*         2.739      .064       logarithm           low score is good     217
   T.S.V.**            549        83
   O.S.V.***           571        87
Slow Movement*         1.518      .349       logarithm           low score is good     217
   T.S.V.              22         26
   O.S.V.              41         34
Steadiness*            2.382      .500       square root         high score is good    217
   T.S.V.              5.67       2.38
   O.S.V.              6.18       2.48
Tapping*               20.514     .898       square root         high score is good    217
   T.S.V.              421        34
   O.S.V.              426        37
Stick Balancing*       1.089      .205       square root of      high score is good    216
   T.S.V.              15         21         logarithm
   O.S.V.              35         72
Speed of Movement      150.06     30.138     none                high score is good    217
   T.S.V.              150.06     "
   O.S.V.              150.06     "

*= value in transformed distribution.
**T.S.V. = transformed score value.
***O.S.V. = original score value.


norm distributions. Any tendency to minimize the magnitude 
of trait differences may thus be viewed as a function of the 
sampling error of a mean of 216 cases.⁹
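
The equating step may be sketched as follows. This is a minimal illustration, assuming a linear conversion of the (transformed) score to a scale with mean 500 and standard deviation 100, and assuming that the sign is reversed on tests where a low score is good, so that a higher scaled score always denotes greater proficiency. The function name and its arguments are hypothetical. The block packing constants are those of Table 3, with the logarithm taken to the base 10.

```python
import math


def scaled_score(raw, norm_mean, norm_sd, transform=None, low_is_good=False):
    """Convert a raw score to a scale with mean 500 and S.D. 100."""
    value = transform(raw) if transform is not None else raw
    z = (value - norm_mean) / norm_sd
    if low_is_good:
        z = -z  # assumed reversal: a higher scaled score is always better
    return 500 + 100 * z


# Block packing norms from Table 3 (log10 scores): M = 2.739, sigma = .064.
BLOCK_M, BLOCK_SD = 2.739, 0.064

# A raw score at the norm mean, and one a full S.D. faster (lower) than it.
at_norm_mean = scaled_score(10 ** BLOCK_M, BLOCK_M, BLOCK_SD,
                            transform=math.log10, low_is_good=True)
one_sd_faster = scaled_score(10 ** (BLOCK_M - BLOCK_SD), BLOCK_M, BLOCK_SD,
                             transform=math.log10, low_is_good=True)
print(round(at_norm_mean), round(one_sd_faster))
```

Under these assumptions a score at the norm mean converts to 500, and a score one standard deviation better than the norm converts to 600.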

The second methodological consideration was the time- 
honored one of securing a zero point and equal units of meas- 
urement on the scale for the evaluation of individual differ- 





⁹ A trial analysis of the scores on the speed of movement test was made
using both the original and the transformed measures. The relative magnitudes 
of the respective factors were identical to two decimal places by the two 
methods. Apparently, none of the information latent in the data is lost through 
the transformation. 


TABLE 4
TESTS OF NORMALITY

Tests                        N       g1        g2        σg1         σg2        P
Block Packing                217     0.315     0.529     ±0.1655     0.3286     >.01
Slow Movement                217     0.761     0.845     "           "          "
Speed of Movement            217     0.312     0.049     "           "          "
[Steadiness or Tapping]      217     0.169     0.356     "           "          "
[Steadiness or Tapping]      217     0.362     0.144     "           "          "
Stick Balancing              216     0.040     0.145     "           "          "
Card Sorting                 219     0.253     0.017     later omitted

g1 and g2 are calculated from R. A. Fisher’s k statistics. A complete
description of the method is to be found in Goulden, C. H.
Methods of Statistical Analysis (New York: John Wiley & Sons,
1939), pp. 27-31.
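
For readers who wish to retrace the test of normality, the g statistics and their standard errors may be sketched as below. The code assumes the usual definitions g1 = k3/k2^(3/2) and g2 = k4/k2^2 in terms of Fisher's k statistics; the function names are hypothetical, and the short data list is an arbitrary illustration and not a norm distribution from the study.

```python
import math


def g_statistics(xs):
    """Return (g1, g2) computed from Fisher's k statistics."""
    n = len(xs)
    mean = sum(xs) / n
    s2 = sum((x - mean) ** 2 for x in xs)
    s3 = sum((x - mean) ** 3 for x in xs)
    s4 = sum((x - mean) ** 4 for x in xs)
    k2 = s2 / (n - 1)
    k3 = n * s3 / ((n - 1) * (n - 2))
    k4 = (n * (n + 1) * s4 - 3 * (n - 1) * s2 ** 2) / (
        (n - 1) * (n - 2) * (n - 3))
    return k3 / k2 ** 1.5, k4 / k2 ** 2


def standard_errors(n):
    """Return the standard errors of g1 and g2 for a sample of size n."""
    se_g1 = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_g2 = math.sqrt(24 * n * (n - 1) ** 2 /
                      ((n - 3) * (n - 2) * (n + 3) * (n + 5)))
    return se_g1, se_g2


g1, g2 = g_statistics([1, 2, 3, 4, 5])   # arbitrary symmetric example
se_g1, se_g2 = standard_errors(217)      # N of most norm distributions
print(g1, g2, se_g1, se_g2)
```

For N = 217 the two standard errors come out close to the values tabled above.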


ences. This would, of course, be necessary in order to justify 
the ultimate pooling of the results. Anastasi¹⁰ has pointed out
that standard scores from scaled, or normal, distributions tend 
to yield such equal units of measurement. Also, the analysis 
of variance deals only with deviates or differences, and not 
with absolute magnitudes. These two considerations seem to 
point to the adequacy of the present data and technique for 
the purpose in view. 

It would have been ideal to establish the normality of the 
distribution of trait differences within each individual in 
similar fashion. However, the number of traits measured was 
so small that this constituted a practical impossibility. Hull¹¹
has stated that the distribution of trait differences appears to
be a normal one. The data of the present study would affirm
this opinion, although no conclusions may be based on the
inspectional method employed. In any case, the error, if any,
introduced by assuming the normality of the distribution of
trait differences would be very slight.

On the assumption that a satisfactory estimate of the rela- 
tive magnitudes of individual and trait differences might be 
obtained, an attempt was made to isolate certain other sources 





¹⁰ A. Anastasi, “Practice and Variability,” Psychological Monographs, XIV
(1933-34), No. 5.

¹¹ C. L. Hull, “Variability in Amount of Different Traits Possessed by the
Individual,” Journal of Educational Psychology, XVIII (1927), 97-106.


of intra-individual variation. In the second type of analysis, 
illustrated in Table 2, the repetitive variations factor (R.V.) 
is seen to be composed of differences between the means of 
administrations. A direct estimate of the magnitude of this 
factor was obtained, as in the case of trait differences, by 
determining its mean per cent contribution to the total
variation from each separate analysis of the second type.¹²
However, the differences between the means of the administrations
may be viewed as being attributable to two distinct sources: 
first, learning; and second, random fluctuations in the indi- 
vidual’s efficiency from day to day in all functions. An attempt 
was made to differentiate between the two. 

Briefly, it was assumed that learning would be at a maxi- 
mum on administrations 2-4, and at a minimum and negli- 
gible level on administrations 6-8. The evidence for this 
assumption follows. (1) If the proposed dichotomy in ad- 
ministrations (2-4 vs. 6-8) is made, the repetitive varia- 
tions factor is significant in the second summary analysis of 
administrations 2-4, and is insignificant in the second sum- 
mary analysis of administrations 6-8. Table 5 gives the 
relevant data, and the probability (P) column illustrates the 
point. (2) Also to be noted in Table 5 is the fact that the 
repetitive variations mean square is only slightly larger than 
the error mean square for administrations six through eight. 
(3) The establishment of a common unit for the estimation 
of improvement makes it apparent that most learning is
confined to administrations 2-5. The “t” test is generalized in
Fisher’s concept of fiducial probability to yield an expression
as to the magnitude which any difference must attain to be
“significant” at any given level of probability.¹³ Specifically,
in the present instance, the fiducial limits at the 10 per cent 





¹² One analysis for each subject in the experimental group; each one in the
form illustrated in Table 2. It makes no essential difference in the results
whether a per cent is obtained in each analysis and the mean of the series
obtained, or whether the sums of squares and degrees of freedom are totaled
and one per cent computed from these “summary statistics.” The latter method
is probably preferable for purposes of estimation.

¹³ R. A. Fisher, Statistical Methods for Research Workers (London: Oliver
& Boyd, 1937).




TABLE 5
REPETITIVE VARIATIONS
LEARNING VS. RANDOM FLUCTUATIONS*

          Degrees of    Sum of            Mean
Factor    Freedom       Squares           Square       F        P       %

Administrations 2-4
T.V.      255           2,828,723.05                                   100
T.D.       75           2,441,890.33      32,558.54    18.68    <.01    83
R.V.       30             125,390.68       4,179.68     2.40    <.01     3
Err.      150             261,442.04       1,742.95                     14

Administrations 6-8
T.V.      255           2,861,646.86                                   100
T.D.       75           2,522,178.81      33,629.05    18.22    <.01    85
R.V.       30              62,656.82       2,088.56     1.13    >.05     0.3
Err.      150             276,811.23       1,845.41                     15

*These are summary statistics derived by totaling the sums of squares and
degrees of freedom of the separate analyses.


level were used as a qualitative, common unit for the measure- 
ment of improvement from administration 2 through admin- 
istration 8. Table 6 contains a summary on this point. 

It should be noted that “improvement” in each individual
case is defined in terms of the amount of variation which may
be viewed as “random.” In accordance with the previously
stated hypothesis, it will be observed that there is only one
exception to the rule that learning, if present, tends to be
confined to the first five administrations.¹⁴ In view of the
evidence presented, it was assumed that the difference between
the initial (2-4) and final (6-8) magnitudes of the repetitive
variations factor might furnish an estimate of the relative
importance of learning.

Finally, it can be shown that the estimate of error, or in- 
teraction, in the second series of analyses may have as many 
as three experimental components. These are: (1) unreliabili- 
ties of measurement, presumably inherent in the test; (2) 
trait variability, presumably inherent in the individual; and 
(3) differential rates of improvement within the individual 

¹⁴ It seemed best to omit administration 5 because it appeared to be at the
inflection point on the learning curve. At best, this method may tend to
underestimate slightly the role of learning.













TABLE 6
FIDUCIAL LIMITS OF LEARNING

Indi-      Administrations (Average of 6 Tests)                              Limits
viduals    II     III      IV       V        VI     VII      VIII            10%
A          519    541      559*     576      592    594      583**           54
B          573    547      564      602**    613    602      613**           38
C          569    582      597      636*     621    625      [illegible]     [illegible]
D          543    581      599      635*     662    673      [illegible]     74
E          493    544*     526      626      593    569      617**           48
F          628    648      664      667*     684    685      673**           39
G          566    612      653*     666      677    651      684**           49
H          521    598*     618      613      640    725*     666             55
I          518    561      578*     580      591    598      [illegible]     [illegible]
J          600    612      630      633*     631    629      [illegible]     32
K          537    571      562      592*     599    626      616**           45
L          550    605*     589      605      620    630      652**           45
M          541    553      556      555**    554    563      569**           43
N          590    596      586      610**    627    623      632**           38
O          467    481      525*     532      564    549      650**           50

Administrations 2-5: T = 12/15        Administrations 6-8: T = 1/15

Key: * = point at which difference from score on administration 2 or 6 becomes
as great as fiducial limits.
** = no difference as great as fiducial limits from administration 2 or 6
through given point.





from function to function. (2) and (3) are, of course, sources 
of intra-individual variation. However, since it was not pos- 
sible in the present instance to differentiate satisfactorily the 
various factors contributing to the estimate of error,¹⁵ it was
thought most parsimonious to exclude them from considera- 
tion as separate entities assignable to either intra- or inter- 
individual sources. 


The Results—It should be stated at the outset that these 
results will be based on only seven administrations of each of 
the various tests, the first administrations being discarded as 
unreliable. They will also include only six tests in the primary 
analysis, since the norms for the card sorting test were found 
to be unsatisfactory. 

Since the present sample is necessarily small, it is interest- 
ing to note these evidences of its representativeness. First, the 





¹⁵ The number of cases would be rather too small to give the curve-fitting
methods much significance. 




average standard deviation of the experimental group was 
over 90 per cent as large as that of the unmatched and un- 
selected norm group. Second, the mean scores of the experi- 
mental group for administration number two are practically 
identical with those of the norm group. Third, the sample
was split to allow an estimate of its internal consistency. 
Table 7 shows the result. This sort of consistency is surely 
one evidence of representativeness. These facts, combined, 
suggest the adequacy of the sample. 














TABLE 7
CONSISTENCY OF SAMPLE

Factor    G1      G2
I.D.      69%     75%     Analysis I
T.D.      76      77      Analysis II
R.V.       7       7
Err.      17      16

(Average proportions of total variation from the analyses of both series)
G1 and G2 = respective halves of sample.





A fundamental assumption in the application of the 
analysis of variance is that experimental error is distributed 
with uniform though unknown variance about a mean of zero. 
Accordingly, Nayer’s test¹⁶ for homogeneity of variance was
applied to these data to check this hypothesis. In all cases, the
value of L failed to reach even the 5 per cent level, which
gives no evidence that the variances within groups differ.
Statistically, this result justifies the application of the
proposed method to these data.

Before turning to the results proper, it should be noted 
that they have a dual methodological aspect. Actually two 
separate problems exist; one is a problem of determining 
significance, and the second a problem of estimating magni- 
tudes. The second problem ceases to exist if the first is not 
satisfied. In accordance with this fact, the discussion of 
significance will precede that of estimation in what follows. 





¹⁶ P. P. N. Nayer, “An Investigation into the Application of Neyman and
Pearson’s L Test, with Tables of Percentage Limits,” Statistical Research 
Memoirs, I (1936), 38-56. 




TABLE 8
INDIVIDUAL DIFFERENCES: SUMMARY

          Degrees of    Sum of            Mean
Factor    Freedom       Squares           Square       F        P       %
T.V.      618           5,373,669.90
I.D.       78           3,713,633.72      47,610.69    23.43    <.01    72
R.V.       72             709,019.91       9,847.50
Err.      468             951,016.27       2,032.09


TABLE 9
TRAIT DIFFERENCES: SUMMARY

          Degrees of    Sum of            Mean
Factor    Freedom       Squares           Square       F        P       %
T.V.      615           7,148,910.67
T.D.       75           5,488,873.51      73,184.98    33.68    <.01    77
R.V.       90             682,345.47       7,581.62     3.49    <.01     7
Err.      450             977,691.67       2,172.65                     16








Tables 8 and 9 show the summary statistics for the two
major series of analyses. These statistics were derived by
adding the sums of squares and degrees of freedom from the
separate analyses of the type illustrated in Tables 1 and 2. It
should be observed that each of the factors is significant above
the 1 per cent level, and that the per cent column contains the
best available estimate of the respective magnitudes. These
per cents are derived from the mean squares for the respective
factors, minus the error mean square, and divided by a
correction for the number of scores contributing to the means
involved. Irwin¹⁷ has, in general, described the method.
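
The derivation of the per cent column may be illustrated with the mean squares of Table 1. The sketch below assumes that the "correction" is simply the number of scores entering each mean (3 administrations behind each individual's mean, 15 individuals behind each administration's mean); under that assumption the printed per cents of Table 1 are recovered. The function name is hypothetical, and the clipping of negative components at zero is an added convention rather than part of the described method.

```python
def percent_contributions(ms_individuals, ms_administrations, ms_error,
                          scores_per_individual, scores_per_administration):
    """Estimate the per cent of total variation due to each source."""
    # Component for a factor: (its mean square - error mean square) / n,
    # where n is the number of scores entering each of the factor's means.
    comp_ind = max(ms_individuals - ms_error, 0.0) / scores_per_individual
    comp_adm = max(ms_administrations - ms_error, 0.0) / scores_per_administration
    total = comp_ind + comp_adm + ms_error
    return tuple(100 * c / total for c in (comp_ind, comp_adm, ms_error))


# Mean squares from Table 1: I.D. 19,456.37; R.V. 22,333.76; Err. 1,411.18.
pct_id, pct_rv, pct_err = percent_contributions(
    19456.37, 22333.76, 1411.18,
    scores_per_individual=3, scores_per_administration=15)
print(round(pct_id), round(pct_rv), round(pct_err))
```

These inputs give 68, 16, and 16 per cent for individual differences, repetitive variations, and error, as in Table 1.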

A question may, however, be raised as to the validity of 
this procedure of adding sums of squares and degrees of 
freedom from the separate analyses of each series on the 
assumption that all the deviations are, in effect, grouped 
about a common grand mean. Although the tests were equated 





¹⁷ J. O. Irwin, “Mathematical Theorems Involved in the Analysis of
Variance,” Journal of Royal Statistical Society, XCIV (1931), 284-300
(especially pp. 293-296).




with this criticism in mind, the question may best be answered 
by demonstrating that the same result is obtained if a method 
demanding no such assumption is employed. 

First, it should be pointed out that the individual differ- 
ences and trait differences factors are highly significant in each 
of the separate analyses in which they occur. Their total 
would, therefore, of necessity be highly significant. The
repetitive variations factor, however, is not invariably
significant in these separate analyses (cf. Table 2). Its total has,
nevertheless, been shown by the method of adding sums of squares
and degrees of freedom to be significant above the 1 per cent
level. It is, then, this specific result which requires
verification. Fisher¹⁸ has described a method appropriate for pooling
the information from mutually exclusive though similar
experiments. His technique makes it possible to sum the independent
probabilities which arise from independent experiments by
utilizing the fact that the log of the probability to the base
“e” is equal to minus one-half Chi-squared. Two degrees of
freedom are allowed for each independent comparison or 
probability value; these degrees of freedom and the log values 
are additive. The total may be tested for significance directly 
by entering the Chi-squared tables with the appropriate num- 
ber of degrees of freedom. The hypothesis in the present 
instance is that the repetitive variations and error factors are 
from the same population. The obtained value of Chi-squared 
is highly significant and refutes this hypothesis as shown in 
Table 10. The table also shows that the results obtained by 
this method and by the method of totaling sums of squares 
and degrees of freedom are comparable, since by either pro- 
cedure the obtained probability value exceeds the 1 per cent 
level by approximately the same amount. 
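
The pooling of independent probabilities described above may be sketched as follows. The code is an illustration under the stated relation, Chi-squared = -2 times the sum of the natural logarithms of the probabilities, with two degrees of freedom per probability. The function name is hypothetical, and the two probabilities in the example are arbitrary values, not results from the present experiment.

```python
import math


def combine_probabilities(p_values):
    """Pool independent probabilities; return (chi_squared, df, pooled_p)."""
    k = len(p_values)
    chi_squared = -2.0 * sum(math.log(p) for p in p_values)
    df = 2 * k
    # For an even number of degrees of freedom the upper tail of
    # Chi-squared has the closed form
    #   exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = chi_squared / 2.0
    term, tail_sum = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        tail_sum += term
    return chi_squared, df, math.exp(-half) * tail_sum


# Two arbitrary probabilities, each just at the 5 per cent level.
chi_squared, df, pooled_p = combine_probabilities([0.05, 0.05])
print(chi_squared, df, pooled_p)
```

In practice the total would be referred to a table of Chi-squared with the stated degrees of freedom, as the text describes; the closed-form tail is used here only to keep the sketch self-contained.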

Two problems, then, remain. The first concerns the de- 
termination of the significance of the difference between the 
mean magnitudes of individual and trait differences. The 
second relates to the relative importance of learning in the 





¹⁸ R. A. Fisher, Statistical Methods for Research Workers (London: Oliver
and Boyd, 1936, sixth edition), 104-106. 




TABLE 10
COMPARISON OF METHODS

R.V./Err.     F  = 3.49*       1% F  = 1.42        ratio = 2.5
Σ P’s         χ² = 130.85      1% χ² = 50.89       ratio = 2.6

The two methods give equivalent results.

*Values taken from R.V./Err. in Table 9.


repetitive variations factor. The former problem was handled 
in the following manner. The per cent of the total variation 
contributed by individual differences was determined for each 
test (cf. Table 1). The per cent of the total variation con- 
tributed by trait differences was determined for each indi- 
vidual (cf. Table 2). These two series of per cents were then 
tested for the significance of the difference between their 
means. Since per cents are distributed as a binomial or Poisson 
distribution, it was necessary to transform the original
measures before applying the test of significance.¹⁹ Fisher and
Yates²⁰ have constructed a table of the inverse sine function
which transforms proportions or per cents to angular degrees
and normalizes their distribution. Fisher’s “t” test was
applied to these transformed values, and the obtained value of
“t” was found to be insignificant at even the 50 per cent level.
It was, thus, assumed that individual differences and trait 
differences were of comparable magnitude. This view that the 
tests are as discrete as the individuals—the specificity view of 
motor skills—is in substantial accord with the published con- 
clusions of such investigators as Perrin, Muscio, Seashore, 





19Clark and Leonard have contributed an excellent discussion of this point. 
Cf. A. Clark and W. H. Leonard, “The Analysis of Variance with Special 
Reference to Data Expressed as Percentages,” Journal of American Society of 
Agronomy, XXXI (1939), 55-66. 

20R, A. Fisher and F. Yates, Statistical Tables for Biological, Agricultural, 
and Medical Research (London: Oliver and Boyd, 1938), p. 90. 


312 











l- 
ba 





INTRA-INDIVIDUAL DIFFERENCES 


Grifitts, and Buxton and Humphreys.*' It likewise agrees 
with the conception of motor abilities propounded by the 
authors of the ““Minnesota Mechanical Ability Tests.’’** These 
last make reference to the specificity view as “the theory of 
unique traits.” 

With respect to the latter problem, then, Table 9 shows 
that the repetitive variations factor accounts for approxi- 
mately 7 per cent of the total variation in the series of analyses 
relative to trait differences. Table 5 shows that the initial 
(2-4) magnitude of the repetitive variations factor is ap- 
proximately 10 times its final (6-8) magnitude. This sug- 
gests, as an estimate, that learning is at least 10 times as 
important a source of variation as are “random” fluctuations 
in the individual's efficiency from day to day in all functions. 

Finally, it may be observed that if individual differences
and trait differences are of comparable magnitude, and if the
repetitive variations factor is of significant magnitude, then
by definition intra-individual differences are greater than inter-
individual differences. This fact was affirmed by determining
the mean per cent contribution of individual differences to the
total variation in the analyses of series one, and of trait dif-
ferences plus repetitive variations to the total variation in the
analyses of series two. The two series of per cents were
appropriately transformed via the inverse sine function and
the “t” test was applied to determine the significance of the
difference between their means. The obtained value, confirm-
ing the hypothesis, was significant above the 1 per cent level.
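
The transformation and test described above can be sketched as follows. This is an illustration under stated assumptions, not the author's computation: the angular value is taken as the arcsine of the square root of the proportion, expressed in degrees, and the t statistic is the usual pooled-variance form for two independent samples. The function names and the per cents are hypothetical.

```python
import math
from statistics import mean, variance

def angular(percent):
    """Inverse sine (angular) transformation: per cent -> degrees."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("per cent must lie between 0 and 100")
    return math.degrees(math.asin(math.sqrt(percent / 100.0)))

def pooled_t(sample_a, sample_b):
    """Two-sample t with pooled variance; returns (t, degrees of freedom)."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / df
    se = math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    return (mean(sample_a) - mean(sample_b)) / se, df

# Hypothetical per cents, for illustration only (not the study's data).
series_one = [38.0, 42.0, 45.0, 40.0]
series_two = [55.0, 60.0, 52.0, 58.0]
t, df = pooled_t([angular(p) for p in series_one],
                 [angular(p) for p in series_two])
print(round(angular(50.0), 1), df)  # 50 per cent maps to 45.0 degrees; df is 6
```

The obtained t would then be referred to a table of t with the stated degrees of freedom.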





²¹ F. A. C. Perrin, “An Experimental Study of Motor Ability,” Journal of
Experimental Psychology, IV (1921), 24-56.

B. Muscio, “Motor Capacity with Special Reference to Vocational Guid-
ance,” British Journal of Psychology, XIII (1922), 152-184.

R. H. Seashore, “Individual Differences in Motor Skills,” Journal of Gen-
eral Psychology, II (1930), 38-66.

C. H. Griffitts, “A Study of Some Motor Ability Tests,” Journal of Applied
Psychology, XV (1931), 109-125.

C. Buxton and L. G. Humphreys, “The Effect of Practice Upon Intercorre-
lations of Motor Skills,” Science, LXXXI (1935), 441-442.

²² D. G. Paterson, R. M. Elliott, et al. Minnesota Mechanical Ability Tests
(Minneapolis: University of Minnesota Press, 1930), p. 586.




The Conclusions—In this population, and with respect to 
these functions, the following are the conclusions of the 
present investigation. 


(1) Intra-individual differences were greater than inter-indi- 
vidual differences. 


(2) Individual differences and trait differences were of com- 
parable magnitude. 


(3) Repetitive variations were of approximately one-seventh 
to one-eighth the magnitude of individual or trait differ- 
ences. 


(4) Learning accounted for at least 90 per cent of the varia- 
tion assigned to the repetitive variations factor. 







NEW TESTS* 


Test for Machinists and Machine Operators, by Joseph Tiffin, H. F. 
Owen, C. C. Stevason, H. G. McComb, and C. D. Hume. 1942. 
An achievement test of technical knowledge for machine shop opera- 
tions. For 12th grade through adult level. Time, approximately 50 
minutes. Machine or self-scored. Price, 18c per copy; specimen set 
25c. Published by Science Research Associates, 1700 Prairie Avenue, 
Chicago, Illinois. 





The Purdue Pegboard, developed by the Purdue Research Foundation. 
1942. A test of manual dexterity and facility for small assembly 
work. For high school through adult level. Time, two to four 
minutes. Price, $9.75. Distributed by Science Research Associates, 
1700 Prairie Avenue, Chicago, Illinois. 





Industrial Training Classification Test, Forms A and B, by Charles 
Lawshe and A. C. Moutoux. 1942. Discriminates between indi- 
viduals likely to profit from industrial training programs and those 
likely to fail. For 12th grade through adult level. Time, 35 min- 
utes. Price, 6c per copy; specimen set 15c. Published by Science 
Research Associates, 1700 Prairie Avenue, Chicago, Illinois. 





Turse-Durost Shorthand Achievement Test, Form A, by Paul L. Turse 
and Walter N. Durost. 1942. Areas sampled are shorthand prin- 
ciples, shorthand penmanship or outline proportions, punctuation, 
paragraphing, sentence structure, and spelling. For first and second 
year shorthand students. Time, approximately 50 minutes. Price,
$1.10 per package of 25 tests; specimen set 15c. Published by the 
World Book Company, Yonkers-on-Hudson, New York. 





The Behavior Cards, by Ralph M. Stogdill. 1941. Designed for use 
as individual test-interview with delinquent boys and girls. For 
ages 9 to 18. Time, 15 to 30 minutes. Price, $2.50 per complete 
set, including specially constructed box, 150 cards, 25 record sheets, 
and manual of directions. Distributed by the Psychological Cor- 
poration, 522 Fifth Avenue, New York City. 


*Prepared by Jane Gilbert. 




Byrd Health Attitude Scale, by Oliver E. Byrd. 1941. Designed to
measure health attitudes of the group or individual. For 11th grade
level through college sophomore level. Time, approximately 30 
minutes. Price, $1.75 per package of 25 tests. Published by Stan- 
ford University Press, Stanford University, California. 





Test on the Effects of War, by Lee J. Cronbach. 1942. A survey
instrument, designed to study morale or confidence of high school
youth. For high school students. Time, approximately 25 min-
utes. Price, 1c per re-usable test; 5c per answer sheet. Published
by the State College of Washington, Pullman, Washington.





Traxler Silent Reading Test, Form 4, by Arthur E. Traxler. Revised,
1942. Includes rate of reading, story comprehension, word mean-
ing, and power of comprehension. For grades 7 to 10. Time, ap-
proximately 50 minutes. Price, 7c per copy; specimen set 30c.
Published by the Public School Publishing Company, Bloomington,
Illinois.










MEASUREMENT ABSTRACTS* 


Ackerman, Dorothy S. “The Critical Evaluation of the Viennese Tests 
as Applied to 200 New York Infants Six to Twelve Months Old.” 
Child Development, XIII (1942), 41-53. 


Bühler’s Viennese Tests for the measurement of development of
infants were given to 200 infants for the purpose of evaluating and
validating them for use with American children. Representative groups
of subjects were used. The procedure followed was that standardized
for the tests. The average developmental quotient score was 106.67
as compared with a score of 100 obtained by Bühler for Viennese chil-
dren. Split-test reliability coefficients ranging from .92 to .98, for the 
different age groups, were obtained. Suggestions for revising some of 
the items are made, but, on the whole, the test is considered to be a 
valuable, practical instrument for estimating the development of in- 
fants. L. Bouthilet. 





Berger, A. “Test Construction and I.Q. Constancy.” Journal of Ex- 
ceptional Children, VIII (1942), 109-111. 


Although much attention has been given to the effect of such factors 
as changes of environment, schooling, and glandular therapy upon I. Q. 
constancy, little emphasis has been placed upon the defects in the tests 
themselves as a source of inconstancy of the I.Q. This paper discusses 
some of the causes of I. Q. fluctuation. Among those listed are the 
fact that the I. Q. varies according to the particular test used to meas- 
ure it, that the same test given at different age levels may involve the 
use of entirely different types of items, and that the variability of the 
groups upon which the tests were standardized may have been different. 


L. Bouthilet. 





Berger, Arthur and Speevack, Morris. “An Analysis of the Range of 
Testing and Scattering Among Retarded Children on Form M of 
the Revised Stanford-Binet Scale.” Journal of Educational Psy-
chology, XXXIII (1942), 72-75.


The authors have found that a large percentage of retarded pupils 
increase their scores on the average 3.14 months of mental age when 
the tests are extended. The rhyme, digits forward and reversed, word 
naming, sentence memory (year XI), response to picture (Messenger 
Boy), and problems of fact are among the items most frequently passed 
beyond the first zero point. Frequent passing of certain items after a
year’s level of complete failures indicates the possibility of inadequate
scaling for these items. It is suggested that the test should be extended
at least to the point where two levels of failures have been reached, if it
is to be an adequate measure in the clinical examination of retarded chil-
dren. Louise T. Grossnickle.


*Edited by Forrest A. Kingsbury.





Burt, C. The Factors of the Mind: An Introduction to Factor Anal- 
ysis in Psychology. London, Univ. London Press. 1940. pp. xiv+ 
509. 


The Factors of the Mind reviews the field of factor analysis, par- 
ticularly the English versions. 


The logical methods rather than the results of factor analysis are 
discussed in the first section. The primary object of factorial methods 
is neither interpretation, which was Spearman’s original concern, nor 
statistical prediction, which was Thomson’s original concern. The ob- 
ject is description. “Mathematically, a factor is simply an average... 
of certain measurements empirically obtained. Logically, it is simply 
a principle of classification—a principle by which both tests (or traits) 
and the persons tested may be classified.” 


The section following describes the similarities among the various 
types of factor techniques. The last section is “an actual application
of . . . the problem of temperamental types.” The inverted factor
technique is given its most complete review in this section. An appendix
contains working methods and tables for computers. Helen Wolfle.





Canady, H. G., Buxton, C. and Gilliland, A. R. “A Scale for the 
Measurement of the Social Environment of Negro Youth.” Journal
of Negro Education, XI (1942), 4-13.


Seventeen environmental factors (social contacts, cultural, educa- 
tional, home, etc.) considered by competent judges important for the 
mental development of Negro youth of high-school age are incorporated 
into an hour’s interview. The subject’s response on each factor is rated 
by the interviewer from 1 to 5 with the aid of a scoring key, yielding 
a total possible score ranging from 17 to 85. The Environmental Inven- 
tory items are less of the socio-economic type than those in the Sims 
scale (with which it correlates .73 ± .04); and it has some relation
(r = .32 ± .06) to intelligence, as compared with the Sims scale
correlation with intelligence, which was found to be .16 ± .05.
F. A. Kingsbury. 





Carter, H. D. “How Reliable are the Common Measures of Difficulty 
and Validity of Objective Test Items?” Journal of Psychology, 
XIII (1942), 31-39. 




Subject-matter tests taken by 200 psychology students were analyzed 
to determine the relative reliability of various measures upon which 
item selection may be based. Results indicated that accurate measures 
of item difficulty may be obtained from a representative group of as few 
as 25 students. The common measure of the power of test items to 
discriminate between good and poor students yielded a reliability coeffi- 
cient of .46. The author concludes that a test may be improved more 
easily by basing selection on a measure of difficulty than on a measure 
of discrimination power. L. Birdsall. 
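
The two kinds of item measure compared in this abstract can be illustrated with a small sketch. The abstract does not say which formulas Carter used, so the choices here are assumptions: difficulty is taken as the proportion answering correctly, and discrimination as the difference in that proportion between the highest and lowest scoring groups (a 27 per cent split is assumed). All names and data are hypothetical.

```python
def item_difficulty(responses):
    """Proportion of examinees answering the item correctly (1 = correct)."""
    return sum(responses) / len(responses)

def discrimination_index(item, totals, fraction=0.27):
    """Upper-minus-lower index: difference in proportion correct between
    the highest and lowest scoring groups on the whole test."""
    order = sorted(range(len(totals)), key=lambda i: totals[i])
    k = max(1, int(len(totals) * fraction))   # size of each extreme group
    low, high = order[:k], order[-k:]
    p_high = sum(item[i] for i in high) / k
    p_low = sum(item[i] for i in low) / k
    return p_high - p_low

# Hypothetical data: ten students, one item, and their total test scores.
item = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
totals = [48, 22, 40, 45, 25, 38, 20, 44, 41, 30]
print(item_difficulty(item), discrimination_index(item, totals))  # prints 0.6 1.0
```

Because the discrimination index rests on only the extreme groups, it is computed from fewer cases than the difficulty value, which is one plausible reason for its lower reliability.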





Crissy, William J. E. and Flanagan, John C. “A Plan for Using 
Punched Cards in Presenting Test Results in Profile Form.” Jour- 
nal of Applied Psychology, XXVI (1942), 94-105. 


The importance of keeping test results in profile form is urged by 
psychologists, counselors, and personnel officers. The authors have 
developed a method for graphic presentation of test results of the Na- 
tional Teacher Examinations of the American Council on Education, 
including a maximum of fifteen scores for each examinee. Scores on the 
tests used are reported on a common scale on which a specific score 
indicates a comparable degree of excellence for any one of the various 
tests. They are reported quickly, inexpensively, and in convenient 
form for permanent filing. The procedures for punching and inter-
preting the profile card are given in the appendix. K. S. Yum.





Dearborn, Walter F. and Rothney, John W. M. Predicting the Child’s 
Development. Cambridge, Sci-Art Publishers. 1941. pp. 360. 


This report is based on the Harvard Growth Study, an investi- 
gation of physical and mental growth. Numerous tests and statistical 
procedures have been applied in an effort to determine constancy or 
variability of growth in intelligence, educational achievement, body size, 
ossification, and other characteristics. Jane Gilbert. 





Deemer, Walter L. “A Method of Estimating Accuracy of Test 
Scoring.” Psychometrika, VII (1942), 65-73. 


When errors of test scoring obey a Poisson frequency law (theo- 
retical considerations suggest that they do), the method described may 
be used for finding the upper fiducial limits of scoring errors per paper. 
A criterion is suggested for establishing tolerance limits on scoring er- 
rors, and a method is given (1) for finding the probability of being 
wrong in the statement that the tolerance limit is being met for a given
size sample or (2) for finding the size of sample that will make this
probability not greater than some fixed value. (Courtesy Psychometrika.)
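
An upper limit of this general kind can be sketched as follows. This is not Deemer's procedure, which the abstract does not spell out; it is a standard exact one-sided bound for a Poisson mean, found by bisection. The function names, the sample size, and the 1 per cent level are assumptions made for illustration.

```python
import math

def poisson_cdf(k, mu):
    """P(X <= k) for a Poisson variable with mean mu."""
    term = total = math.exp(-mu)
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total

def upper_limit_per_paper(errors, papers, alpha=0.01):
    """Upper limit for the mean number of errors per paper: the rate at
    which seeing `errors` or fewer in `papers` has probability alpha."""
    lo, hi = 0.0, float(errors + 1)
    while poisson_cdf(errors, hi) > alpha:   # expand until the limit is bracketed
        hi *= 2.0
    for _ in range(100):                     # bisection on the total mean
        mid = (lo + hi) / 2.0
        if poisson_cdf(errors, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi / papers

# Hypothetical check: no errors found in a sample of 200 rescored papers.
print(round(upper_limit_per_paper(0, 200, alpha=0.01), 4))  # prints 0.023
```

With no errors observed, the bound reduces to ln(1/alpha) divided by the number of papers, which gives a quick check on the search.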





Dodd, Stuart Carter. Dimensions of Society. New York, Macmillan. 
1942. pp. 944. 


This book presents a mathematical approach to society and repre- 
sents an attempt to systematize statistical forms and data relative to 
society. The theory upon which this study is based is as follows: “Any
quantitatively recorded societal situation (S) can be expressed as a com-
bination of time (T), space (L), a human population (P), and indi-
cators (I) of their characteristics . . . each type analyzable into a speci-
fied number of indices each operationally developed by its exponents
and each subdivided into a specified number of class intervals and further
subdivided into a specified number of cases.” While the field of soci-
ology is emphasized in this presentation, the methodology should be 
applicable to all quantifiable data in each of the social sciences. Jane 
Gilbert. 





Ezekiel, Mordecai. Methods of Correlation Analysis. New York, 
John Wiley and Sons. 1941. pp. 531. 


Although this book treats statistical procedures largely from the 
economic point of view, it should be of general interest to measurement 
workers. It does not cover the entire field of statistics; rather, it deals 
with the types of relationships between variables. The author has at- 
tempted to bring up to date the interpretation of standard errors and 
to point out the application of the logical limitations to graphic curve 
flexibility. New and speedier methods of calculation and methods of 
estimating reliability of individual estimates are also presented. Jane
Gilbert.





Ferguson, George A. “Item Selection by the Constant Process.” Psy- 
chometrika, VII (1942), 19-29. 


This paper relates the constant process used in psychophysics to 
the problem of item selection. Each test item may be described in 
terms of a limen, which is an index of the point at which an item dis- 
criminates, and the standard deviation of the limen, which is an index 
of the “goodness” of discrimination. The method developed may be 
related not only to the description of items but also to the description 
of persons. Thus a person’s ability may be described in terms of a 
limen and its standard deviation. (Courtesy Psychometrika.) 
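
The idea of describing an item by a limen and its standard deviation can be sketched by fitting a normal ogive to the proportion passing at several ability levels. The classical constant process applies weights to the observations; the version below is a simplified, unweighted least-squares fit and should be read as an assumption, not as Ferguson's procedure. The names and data are hypothetical.

```python
from statistics import NormalDist, mean

def limen_and_sd(levels, proportions):
    """Fit a normal ogive by unweighted least squares on normal deviates.
    Returns (limen, standard deviation)."""
    z = [NormalDist().inv_cdf(p) for p in proportions]   # requires 0 < p < 1
    x_bar, z_bar = mean(levels), mean(z)
    slope = (sum((x - x_bar) * (v - z_bar) for x, v in zip(levels, z))
             / sum((x - x_bar) ** 2 for x in levels))
    intercept = z_bar - slope * x_bar
    # The limen is the level at which half pass (z = 0); the S.D. is 1/slope.
    return -intercept / slope, 1.0 / slope

# Hypothetical item: proportions generated from a known ogive at five levels.
levels = [1.0, 2.0, 3.0, 4.0, 5.0]
true_curve = NormalDist(mu=3.0, sigma=1.5)
proportions = [true_curve.cdf(x) for x in levels]
limen, sd = limen_and_sd(levels, proportions)
print(round(limen, 3), round(sd, 3))  # recovers 3.0 and 1.5
```

A small standard deviation corresponds to a steep ogive, that is, to an item that separates persons sharply near its limen.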




Gillette, Annette L. “Relative Difficulty of Tests Within Each Year
Level of Revised Stanford-Binet, Form L, Years Six Through 
Twelve.” Journal of Psychology, XII (1941), 125-138. 


“The data (from 506 cases) clearly indicate that within year levels 
there are variations in the difficulty of tests as measured by the per- 
centage passing . . . The tables indicate the differences in difficulty 
of tests within levels and the reliability of these differences.” Each of 
the 42 tests is named and numbered and placed in order of per cent 
passing of the total group. The tables will be of great value to the 
clinician. Helen M. Wolfle. 





Greene, Harry A., Jorgenson, Albert N., and Gerberich, J. Raymond. 
Measurement and Evaluation in the Elementary School. New York, 
Longmans, Green, and Company. 1942. pp. 639. 


This book has been designed as a handbook of measurement for ele- 
mentary school teachers and students of elementary education. Par- 
ticular attention has been given to the problems involved in the con- 
struction, use, improvement, and interpretation of teacher-made exami- 
nations and tests. Important changes and trends in curriculum organi- 
zation, instructional techniques, and in measurement and evaluation 
techniques have been incorporated in this edition, which is a revision 
of an earlier text. 

The authors discuss types of educational and mental tests, the criteria 
of a good examination, construction and use of various types of standard- 
ized tests, the nature and use of intelligence and personality tests, meas- 
urement and remediation in specific academic areas, and finally, the use 
of test results for guidance purposes. While the book is directed at the 
elementary level, it should also be of general interest to measurement 
workers and teachers at all levels, particularly with reference to the dis- 
cussions on test construction and standardization. Jane Gilbert. 





Grossnickle, Louise T. “The Scaling of Test Scores by the Method 
of Paired Comparisons.” Psychometrika, VII (1942), 43-64. 


The purpose of this study is to investigate, by the method of paired 
comparisons, a possible scaling of individuals who have made certain 
test scores, such that the additive property will be satisfied and such that 
a stability in scaling will be maintained—in other words, a scaling such 
that the scaled score of an individual will remain relatively the same 
regardless of the grouping of individuals in which he may be placed. 
The results show that it is possible to utilize psychophysical methods 
in psychological and educational test situations. Among the major find- 
ings are that Case V of the Law of Comparative Judgment is appli-
cable to the data in this problem, the method of dividing the inter-
mediate category equally between the greater and the less was the best 
of three possible methods, internal consistency was satisfied, and, finally, 
when a new test of stability was applied, it was found that the distances 
between the hypothetical individuals remain the same. (Courtesy 
Psychometrika.) 
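
Case V scaling, which the abstract reports as applicable, can be sketched in a few lines. The example assumes a complete matrix of paired-comparison proportions with no entries of 0 or 1 (the normal deviate is undefined there), and it scales three generic stimuli, not individuals grouped by test score as in the study. The function name and the proportions are hypothetical.

```python
from statistics import NormalDist, mean

def case_v_scale(p):
    """Thurstone Case V scale values from a matrix of proportions, where
    p[i][j] is the proportion of judgments placing j above i.
    Values are column means of normal deviates, shifted so the minimum is 0."""
    n = len(p)
    inv = NormalDist().inv_cdf
    z = [[0.0 if i == j else inv(p[i][j]) for j in range(n)] for i in range(n)]
    raw = [mean(z[i][j] for i in range(n)) for j in range(n)]
    low = min(raw)
    return [v - low for v in raw]

# Hypothetical proportions for three stimuli A, B, C (p[i][j] + p[j][i] = 1).
p = [
    [0.50, 0.70, 0.90],
    [0.30, 0.50, 0.75],
    [0.10, 0.25, 0.50],
]
scale = case_v_scale(p)
print([round(v, 2) for v in scale])
```

Internal consistency can then be examined by comparing the observed proportions with those implied by the differences between scale values.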





Guest, L. P. “Last vs. Usual Purchase Questions.” Journal of Ap- 
plied Psychology, XXVI (1942), 180-86. 


The use of the questionnaire in market research has led to increased 
interest in the problem of the form of the question to enable the respon- 
dent to answer as easily and correctly as possible. The problem is to 
determine the difference between the two forms, “last purchase” and 
“usual purchase,” in the questionnaire. The writer had two groups 
consisting of 438 college students, representing “last purchase” and 
“usual purchase” groups. Each student was asked to answer a question- 
naire of 24 questions. The results show that the two questions give 
comparable answers for the most part when the results are treated for 
groups rather than individuals. In measuring the brand preferences, 
trends could be established equally well by either one of the two forms. 
K. S. Yum.





Guilford, J. P. “A Simple Scoring Weight for Test Items and Its 
Reliability.” Psychometrika, VI (1941), 367-74. 


It is pointed out that the scoring weights for test items should be 
approximations to regression-equation weights. For this reason any 
estimate of reliability of the weight should not be permitted to influence 
the size of the weight but should be used in determining the limit of 
acceptability of an item. A simple approximation weight is recom- 
mended for general use, and an abac is provided for the estimation of it 
when the correlation between item and criterion is the phi coefficient. 
A formula for the standard error of this weight is derived and tables 
of significant and very significant weights are presented in terms of de- 
viation from the median weight. (Courtesy Psychometrika.) 
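
The phi coefficient mentioned above is computed from a fourfold table of item response against criterion. The sketch below shows only that coefficient; Guilford's scoring weight and its standard error are not given in the abstract and are not reproduced here. The function name and the counts are hypothetical.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a fourfold table:
                   criterion high   criterion low
    item passed          a                b
    item failed          c                d
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("a marginal total is zero; phi is undefined")
    return (a * d - b * c) / denom

# Hypothetical counts for one item against a dichotomized criterion.
print(round(phi_coefficient(40, 10, 20, 30), 3))  # prints 0.408
```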





Helson, Harry. “Multiple-Variable Analysis of Factors Affecting Light- 
ness and Saturation.” American Journal of Psychology, LV (1942), 
46-57. 


Factors affecting judgments of lightness (brightness) and saturation 
were evaluated through the use of analysis of variance. Judgments were 
made on an eleven-point scale running from zero to 10 for each attribute.
All computations are shown and explained in detail. Judgments of satu-
ration were significantly affected by background (white, gray, or black), 
intensity of illumination, hue, and the interaction of hue and background. 
Background was most important. Judgments of lightness were sig- 
nificantly affected by background, intensity of illumination, and hue. Il-
lumination was most important. Helen M. Wolfle.





Holliday, Frank. “A Survey of an Investigation into the Selection
of Apprentices for the Engineering Industry.” Occupational Psy- 
chology, XVI (1942), 1-19. 


The use of a battery of intelligence and aptitude tests improved the 
selection of English trade and engineer apprentices. Improvement was 
shown by a decreasing number of failures on national examinations, by 
foremen’s satisfaction with the greater aptitude of their new apprentices, 
and by studies of the correlations between test scores and later success. 
Intelligence scores correlated with later success in mathematics, and 
aptitude scores with success in drawing. High intelligence scores alone 
were insufficient in predicting either the good trade or the good engineer 
apprentices. Helen M. Wolfle. 





Holzinger, Karl J. and Harman, Harry H. Factor Analysis. Chicago, 
Univ. Chicago Press. 1941. pp. 417. 


This book has been written to present the various approaches to 
the problem of factor analysis. The analytic and geometric bases for 
factor analysis are discussed as well as the theoretical development of 
various types of solution. Numerous practical illustrations are cited 
together with complete calculations. Jane Gilbert. 





Jurgensen, Clifford E. “A Two-Dimensional Rating Scale.” American
Journal of Psychology, LV (1942), 255-60.


A two-dimensional rating scale developed for use in a boys’ camp 
consists of ten traits or questions, each of which forms a scale represent- 
ing five types of behavior. The first and fifth are apparently two oppo- 
site terms descriptive of the same trait; the middle or third is normal 
or average; and the second and fourth supposedly fall between the aver- 
age and the extremes. The second dimension indicates the frequency 
of each type of behavior in terms of seven different degrees, such as 
constantly, almost always, usually, frequently, sometimes, hardly ever, 
and never. Administration of the scale and the scoring system are de-
scribed. K. S. Yum.


Katz, Daniel. “Psychological Tasks in the Measurement of Public 
Opinion.” Journal of Consulting Psychology, VI (1942), 59-65.


Polling opinion is useful as background for any successful campaign 
for influencing people. In fact, it is basic to the democratic process. 
In addition to such practical utility, it is important in the development 
of the science of social psychology. Many of the significant problems 
of social psychology which are difficult to handle in the laboratory can 
be profitably approached through the field study which ascertains atti- 
tudes and opinions. The author reviews the existing organizations and 
agencies as well as their type of work, and describes fundamental train- 
ing and equipment for this field of public service. Louise T. Gross- 
nickle. 





Kent, G. H. “Emergency Battery of One-Minute Tests.” Journal 
of Psychology, XIII (1942), 141-157. 


A battery of brief tests is given, suitable for use as a preliminary 
measure in psychiatric examinations or under conditions in which presen- 
tation of longer, more formal tests is not feasible. Five oral tests and 
seven written tests are described. Some of the tests have not been 
standardized, and others are revised forms of previously published tests. 
The value of the tests for use in the military situation is emphasized. 


L. Bouthilet. 





Link, Henry C., and Freiburg, A. D. “The Problem of Validity vs. 
Reliability in Public Opinion Polls.” Public Opinion Quarterly, 
VI (1942), 87-98. 


In spite of the fact that public opinion polls have attained the status 
of a scientific instrument and issues of national and international im- 
portance are being considered with reference to poll results, their use and 
interpretation are subject to error. In order to make possible an eval- 
uation of reliability, a statement of the size and distribution of the 
sample of population interviewed should accompany each poll. High 
reliability, however, does not insure validity. One important check 
on the dangerous tendency to accept poll results uncritically has been 
their validation by periodic election returns. Validation by compari-
son with specific purchasing behavior is also feasible. Questions on pub-
lic attitudes and action should be framed in specific and behavioral terms 
rather than in general, stereotyped language. A discussion of various 
other practical techniques of validation is given, with the conclusion 
that the basic criterion of validity is behavior. L. Bouthilet. 





Marble, Samuel D. “A Performance Basis for Employee Evaluation.” 
Personnel, XVIII (1942), 217-226.







Better efficiency ratings can be secured from rating scales when 
their items deal with actual behavior on the job rather than with person- 
ality traits. After the descriptions of the job items are secured, they are 
evaluated. The relative importance of each behavior item to the job 
in question can be obtained by the psycho-physical method of equal- 
appearing intervals. Only items on which there is agreement among the 
judges are included in the final scale. Such a scale encourages the 
supervising officer to distinguish between the descriptive and evaluative 
function, and makes his task more palatable. Helen M. Wolfle. 





Marshall, M. V. “A Study of the Stanford Scientific Aptitude Test.” 
Occupations, XX (1942), 433-434. 


The test was administered to 47 students at the end of their sopho- 
more year or the beginning of the junior year. Scores were then cor- 
related, by the product-moment method, with the average science grade 
in the freshman and sophomore years, average science grade in the junior 
and senior years, and the average chemistry, physics, and biology grades 
for all four years, respectively. Twenty-five students took the test twice, 
once at the end of the sophomore year and again during the senior year. 
The results show that the test possesses high reliability but rather low 
validity. The author feels, therefore, that its practical utility with college
students for the purpose of vocational guidance is open to question. 


K. S. Yum.





McNemar, Quinn. “On the Number of Factors.” Psychometrika, 
VII (1942), 9-18. 


A proposed criterion for the number of factors is developed on the 
basis of the similarity between a factorial residual and the partial cor- 
relation coefficient; something is known concerning the sampling error 
of the latter. Instead of computing the residuals as partials, a formula 
is presented for adjusting the standard deviation of the distribution of 
residuals so as to approximate the S.D. of the residuals as partial cor- 
relations. The criterion requires that factors be extracted until the 
adjusted S.D. reaches or falls below 1/√N. When tried out on six
samples drawn from six universes of known factorial description, the 
criterion indicated the correct number of factors each time. The requi- 
sites of situations adequate for such empirical checks are discussed. 
(Courtesy Psychometrika.) 
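
The stopping rule in this abstract can be sketched as a simple check. The formula for adjusting the standard deviation of the residuals is not given in the abstract, so the function below takes already adjusted values as input; its name and the figures are hypothetical.

```python
import math

def factors_needed(adjusted_sds, n_cases):
    """Return the number of factors at which the adjusted S.D. of the
    residuals first reaches or falls below 1/sqrt(N); None if it never does.
    adjusted_sds[k] is the adjusted S.D. after k + 1 factors."""
    threshold = 1.0 / math.sqrt(n_cases)
    for k, sd in enumerate(adjusted_sds, start=1):
        if sd <= threshold:
            return k
    return None

# Hypothetical adjusted S.D.s after extracting 1, 2, 3, and 4 factors, N = 400.
print(factors_needed([0.110, 0.071, 0.048, 0.031], 400))  # prints 3
```

With N = 400 the threshold is .05, so extraction would stop at the third factor in this example.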





McQuitty, Louis L. “Conditions Affecting the Validity of Personal- 
ity Inventories: I; II; III.” Journal of Social Psychology, XV 
(1942), 32-52. 




These three articles deal with conditions affecting the validity of 
personality inventories. The method of the study is to compare certain 
conditions affecting personality inventories with analogous conditions 
for intelligence tests as to nature of test items, content, directions to 
subjects, and the scoring of item responses; as to techniques of test con- 
struction, interrelations of item scores or answers, the selection and 
elimination of items, and scaling and scoring; finally, as to the nature 
of individual differences as influenced by both hereditary and environ- 
mental conditions. The author suggests possible ways of increasing the 
validity of the personality inventories. K. S. Yum. 





Owens, W. A., Jr. “A New Technic in Studying the Effects of Prac-
tice upon Individual Differences.” Journal of Experimental Psy-
chology, XXX (1942), 180-183.


R. A. Fisher’s analysis of variance is suggested as a technique for 
obtaining an estimate of the effects of practice upon individual differ- 
ences. The technique was applied to a study of motor skill tests, with 
individual differences and test administrations used as criteria of classi- 
fication. The results show a tendency for individual differences to 
increase slightly, but statistically insignificantly, with practice. This ten- 
dency to remain constant suggests the importance of the initial selection 
program in industry. George W. Boguslavsky.





Reid, Seerley. “Respondents and Non-respondents to Mail Question-
naires.” Educational Research Bulletin, XXI (1942), 87-96.


The accuracy of mail questionnaire results is difficult to estimate 
because of partial responses. In a study of the use of radios in Ohio 
schools an analysis was made of the difference in replies between first 
respondents, those responding to a follow-up letter, and a sample of the 
remaining group who responded only after intensive persuasion. Sta- 
tistically significant differences were found between the groups, demon- 
strating that if the replies of the first group, or even the first and second 
groups together, had been used, erroneous and inaccurate conclusions 
would have ensued. Implications of the study for other investigations 
of the same type are that follow-up methods are necessary, that a rep- 
resentative sample of the non-respondents may be used to indicate the 
trend of their answers, and that in cases in which a follow-up question-
naire cannot be employed, the possibility of error must be recognized.
L. Bouthilet.





Rodeheaver, Newton and Grim, Paul R. “Tests in Civics and Citi-
zenship, Part II.” Social Education, VI (1942), 222-224.







This is the second installment of a bibliography of tests of various 
aspects of knowledge and attitude in the field of government. General 
headings include tests on the Declaration of Independence, the United 
States Constitution, community affairs, current affairs, and attitudes 
and beliefs. The objectives of the test, school grades for which it is 
suited, and a critical comment accompany each title. L. Bouthilet. 





Slater, P. “Notes on Testing Groups of Young Children.” Occupa- 
tional Psychology, XVI (1942), 31-38. 


The basic principle of securing rapport and constancy of testing 
conditions among different groups of subjects, especially among young 
children of different ages, is a very important one. The author is par- 
ticularly concerned with some of the conditions for the administration 
of the N.I.I.P. Group Test 70 on groups of children who are 11, but 
not yet 12 years old, and who are 13, but not yet 14 years old, respec- 
tively. In administering the test, the psychologist should consider the 
particular age group he is testing to secure the psychological condition 
of clear understanding and to meet the types of difficulty that are likely 
to arise. Louise Grossnickle. 





Swineford, Frances. “Some Comparisons of the Multiple-Factor and
the Bi-Factor Methods of Analysis.” Psychometrika, VI (1941), 
375-82. 


Bi-factor and multiple-factor analyses of the same data are com- 
pared in two respects. First, two criteria are suggested for determining 
when the factorization is adequate. This problem being more acute 
for the centroid method than for the bi-factor method, the latter is used 
primarily for comparison only. It is shown also that the omission from 
the simple structure of entries smaller than .10 yields a pattern which is 
a poorer fit to the original correlations than is the bi-factor pattern. Sec- 
ond, the second-order general factor obtained from the intercorrelations 
of the primaries is found to be highly correlated with the general factor 
of the bi-factor pattern. (Courtesy Psychometrika.) 





Toops, Herbert A. “Code Numbers as a Means of Scoring Group-
Administered Performance Test Products.” Journal of Applied
Psychology, XXVI (1942), 136-50.




Among the chief obstacles to the establishment of an adequate per-
formance test program for guidance are the time and skill involved in
the cost of scoring, and the delay between subsequent tests administered
using the same equipment. In view of the fact that all mechanical
performance test products have as a common feature space arrangements
of movable sub-parts of a whole, and consequently exist in only a lim-
ited number of ways or patterns of correct and partially correct prod-
ucts, the author suggests the employment of “Addends” as a means of
quick and certain identification of such performances. He shows in
detail how to apply this addend principle by illustrations of fish-pole
assembly and a bolt-and-washer assembly. K. S. Yum.
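
The abstract does not spell out how the addends are chosen. The sketch below shows one way such a scheme could work, assuming each checkable feature of a finished product is given a distinct power of two so that the sum of the addends identifies exactly one pattern. The feature names and addend values are hypothetical and are not taken from the article.

```python
# Hypothetical features of a bolt-and-washer product, each with its own addend.
# Powers of two are an assumption; they guarantee every sum is unique.
ADDENDS = {
    "washer_on_bolt": 1,
    "nut_on_bolt": 2,
    "washer_before_nut": 4,
}


def code_number(observed_features):
    """Add the addends of the features the scorer observes in the product."""
    return sum(ADDENDS[feature] for feature in observed_features)


def decode(code):
    """Recover the set of features that a given code number stands for."""
    return {feature for feature, addend in ADDENDS.items() if code & addend}


# A partially correct product: both parts are on the bolt, but in the wrong order.
partial = code_number({"washer_on_bolt", "nut_on_bolt"})
print(partial, sorted(decode(partial)))
```

Under this assumption the scorer records a single number per product, and a key sheet maps each number to a pattern and its credit.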








